Preface


A  book  collecting  the  celebrated  problems  of  elementary  mathematics  that 
would  commemorate  their  origin  and,  above  all,  present  their  solutions  briefly, 
clearly,  and  comprehensibly  has  long  seemed  a  necessary  and  attractive  task  to 
the  author. 

The  restriction  to  problems  of  elementary  mathematics  was  considered 
advisable  in  view  of  those  readers  who  have  neither  the  time  nor  the  opportunity 
to  acquaint  themselves  in  any  detail  with  higher  mathematics.  Nevertheless,  in 
spite  of  this  limitation  a  colorful  and  compelling  picture  has  emerged,  one  that 
gives an idea of the amazing variety of mathematical methods and one that will,
I hope, enchant many who are interested in mathematics and who take
pleasure  in  characteristic  mathematical  thought  processes.  In  the  present  work 
there  are  to  be  found  many  pearls  of  mathematical  art,  problems  the  solutions  of 
which  represent,  in  the  achievements  of  a  Gauss,  an  Euler,  Steiner,  and  others, 
incredible  triumphs  of  the  mathematical  mind. 

Because  the  difficult  economic  situation  at  the  present  time  barred  the 
publication  of  a  larger  work,  a  limit  had  to  be  set  to  the  scope  and  number  of  the 
problems  treated.  Thus,  I  decided  on  a  round  number  of  one  hundred  problems. 
Moreover,  since  many  of  the  problems  and  solutions  require  considerable  space 
despite  the  greatest  concision,  this  had  to  be  compensated  for  by  the  inclusion  of 
a  number  of  mathematical  miniatures.  Possibly,  however,  it  may  be  just  these 
little  problems,  which  are,  in  their  way,  true  jewels  of  mathematical  miniature 
work,  that  will  find  the  readiest  readers  and  win  new  admirers  for  the  queen  of 
the  sciences. 

As  we  have  indicated  already,  a  knowledge  of  higher  analysis  is  not  assumed. 
Consequently,  the  Taylor  expansion  could  not  be  used  for  the  treatment  of  the 
important  infinite  series.  I  hope  nonetheless  that  the  derivations  we  have  given, 
particularly  the  striking  derivation  of  the  sine  and  cosine  series,  will  please  and 
will  not  be  found  unattractive  even  by  mathematically  sophisticated  readers. 

On  the  other  hand,  in  some  of  the  problems,  e.g.,  the  Euler  tetrahedron 
problem  and  the  problem  of  skew  lines,  the  author  believed  it  necessary  not  to 
dispense  with  the  simplest  concepts  of  vector  analysis.  The  characteristic 
advantages  of  brevity  and  elegance  of  the  vector  method  are  so  obvious,  and  the 
time  and  effort  required  for  mastering  it  so  slight,  that  the  vectorial  methods 
presented here will undoubtedly spur many readers on to look into this attractive
area.

For  the  rest,  only  the  theorems  of  elementary  mathematics  are  assumed  to  be 
known,  so  that  the  reading  of  the  book  will  not  entail  significant  difficulties.  In 
this  connection  the  inclusion  of  the  little  problems  may  in  fact  increase  the 
acceptability  of  the  book,  in  that  it  will  perhaps  lead  the  mathematically  weaker 
readers,  after  completion  of  the  simpler  problems,  to  risk  the  more  difficult  ones 
as  well. 

So  then,  let  the  book  go  out  and  do  its  part  to  awaken  and  spread  the  interest 
and  pleasure  in  mathematical  thought. 

Wiesbaden, Fall, 1932

Heinrich Dörrie


Preface  to  the  Second  Edition 

The  second  edition  of  the  book  contains  few  changes.  An  insufficiency  in  the 
proof  of  the  Fermat-Gauss  Impossibility  Theorem  has  been  eliminated,  Problem 
94  has  been  placed  in  historical  perspective  and  the  Problem  of  the  Length  of  the 
Polar  Night,  which  in  relation  to  the  other  problems  was  of  less  significance,  has 
been replaced by a problem of a higher level: “André’s Derivation of the Secant
and  Tangent  Series.” 


Wiesbaden, Spring, 1940

Heinrich Dörrie
1


Archimedes’  Problema  Bovinum 


The sun god had a herd of cattle consisting of bulls and cows, one part of
which  was  white,  a  second  black,  a  third  spotted,  and  a  fourth  brown. 

Among  the  bulls,  the  number  of  white  ones  was  one  half  plus  one  third  the 
number  of  the  black  greater  than  the  brown;  the  number  of  the  black,  one 
quarter  plus  one  fifth  the  number  of  the  spotted  greater  than  the  brown;  the 
number  of  the  spotted,  one  sixth  and  one  seventh  the  number  of  the  white  greater 
than  the  brown. 

Among the cows, the number of white ones was one third plus one quarter of
the  total  black  cattle;  the  number  of  the  black,  one  quarter  plus  one  fifth  the  total 
of  the  spotted  cattle;  the  number  of  the  spotted,  one  fifth  plus  one  sixth  the  total 
of  the  brown  cattle;  the  number  of  the  brown,  one  sixth  plus  one  seventh  the  total 
of  the  white  cattle. 

What  was  the  composition  of  the  herd? 

Solution.  If  we  use  the  letters  X,  Y,  Z,  T  to  designate  the  respective  number 
of  the  white,  black,  spotted,  and  brown  bulls  and  x,  y,  z,  t  to  designate  the  white, 
black,  spotted,  and  brown  cows,  we  obtain  the  following  seven  equations  for 
these  eight  unknowns: 

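(1) $X - T = \tfrac{5}{6}Y$,      (4) $x = \tfrac{7}{12}(Y + y)$,

(2) $Y - T = \tfrac{9}{20}Z$,     (5) $y = \tfrac{9}{20}(Z + z)$,

(3) $Z - T = \tfrac{13}{42}X$,    (6) $z = \tfrac{11}{30}(T + t)$,

                                  (7) $t = \tfrac{13}{42}(X + x)$.

From equations (1), (2), (3) we obtain 6X - 5Y = 6T, 20Y - 9Z = 20T, 42Z - 13X = 42T,
and taking these three equations as equations for the three unknowns
X, Y, and Z, we find

$X = \tfrac{742}{297}T, \quad Y = \tfrac{178}{99}T, \quad Z = \tfrac{1580}{891}T.$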

Since  891  and  1580  possess  no  common  factors,  T  must  be  some  whole 
multiple — let  us  say  G — of  891.  Consequently, 

(I)  X  =  2226G,  Y  =  1602G,  Z  =  1580G,  T  =  891G. 

If these values are substituted into equations (4), (5), (6), (7), the following
equations are obtained:

12x - 7y = 11214G,   20y - 9z = 14220G,

30z - 11t = 9801G,   42t - 13x = 28938G.

These  equations  are  solved  for  the  four  unknowns  x,  y,  z,  t  and  we  obtain 

(II)  cx = 7206360G,  cy = 4893246G,

      cz = 3515820G,  ct = 5439213G,


in  which  c  is  the  prime  number  4657.  Since  none  of  the  coefficients  of  G  on  the 
right  can  be  divided  by  c,  then  G  must  be  an  integral  multiple  of  c : 

G = cg.


If  this  value  of  G  is  introduced  into  (I)  and  (II),  we  finally  obtain  the  following 
relationships: 


(I')   X = 10366482g,  Y = 7460514g,  Z = 7358060g,  T = 4149387g,

(II')  x = 7206360g,   y = 4893246g,  z = 3515820g,  t = 5439213g,

where  g  may  be  any  positive  integer. 

The problem therefore has an infinite number of solutions. If g is assigned the
value  1,  we  obtain  the  following: 


Solution  in  the  Smallest  Numbers 


white  bulls  10,366,482 
black  bulls  7,460,514 
spotted  bulls  7,358,060 
brown  bulls  4,149,387 


white  cows  7,206,360 
black  cows  4,893,246 
spotted  cows  3,515,820 
brown  cows  5,439,213 
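As a check on the arithmetic, the seven conditions (1) to (7) can be tested with exact rational arithmetic. The following Python sketch is not part of the original text; the function name `conditions_hold` and the variable names are chosen here purely for illustration.

```python
from fractions import Fraction as F

# Smallest solution (g = 1): bulls X, Y, Z, T and cows x, y, z, t
# in the order white, black, spotted, brown.
X, Y, Z, T = 10366482, 7460514, 7358060, 4149387
x, y, z, t = 7206360, 4893246, 3515820, 5439213

def conditions_hold(X, Y, Z, T, x, y, z, t):
    """Return True if all seven equations (1)-(7) are satisfied exactly."""
    return all([
        X - T == (F(1, 2) + F(1, 3)) * Y,      # (1)
        Y - T == (F(1, 4) + F(1, 5)) * Z,      # (2)
        Z - T == (F(1, 6) + F(1, 7)) * X,      # (3)
        x == (F(1, 3) + F(1, 4)) * (Y + y),    # (4)
        y == (F(1, 4) + F(1, 5)) * (Z + z),    # (5)
        z == (F(1, 5) + F(1, 6)) * (T + t),    # (6)
        t == (F(1, 6) + F(1, 7)) * (X + x),    # (7)
    ])

total = X + Y + Z + T + x + y + z + t
print(conditions_hold(X, Y, Z, T, x, y, z, t), total)
```

Because the equations are homogeneous, multiplying all eight numbers by the same positive integer g gives another solution, in agreement with (I') and (II').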


Historical.  As  the  above  solution  shows,  the  problem  of  the  cattle  cannot 
properly  be  considered  a  very  difficult  problem,  at  least  in  terms  of  present 
concepts.  Since,  however,  in  ancient  times  a  difficult  problem  was  frequently 
referred  to  specifically  as  a  problema  bovinum  or  else  as  a  problema  Archimedis, 
one  may  assume  that  the  form  of  the  problem  dealt  with  above  does  not 
represent the complete and original form of Archimedes’ problem, especially
when one considers the rest of Archimedes’ brilliant achievements, as well as the
fact  that  Archimedes  dedicated  the  cattle  problem  to  the  Alexandrian  astronomer 
Eratosthenes. 

A  “more  complete”  formulation  of  the  problem  is  contained  in  a  manuscript 
(in Greek) discovered by Gotthold Ephraim Lessing in the Wolfenbüttel library
in  1773.  Here  the  problem  is  posed  in  the  following  poetic  form,  made  up  of 
twenty-two  distichs,  or  pairs  of  verses: 

Number  the  sun  god’s  cattle,  my  friend,  with  perfect  precision. 

Reckon  them  up  with  great  care,  if  any  wisdom  you’d  claim: 

How  many  cattle  were  there  that  once  did  graze  in  the  meadows 
On  the  Sicilian  isle,  sorted  by  herds  into  four, 

Each  of  these  four  herds  differently  colored:  the  first  herd  was  milk-white, 
Whereas  the  second  gleamed  in  a  deep  ebony  black. 

Brown  was  the  third  group,  the  fourth  was  spotted;  in  every  division 
Bulls  of  respective  hues  greatly  outnumbered  the  cows. 

Now,  these  were  the  proportions  among  the  cattle:  the  white  ones 
Equaled  the  number  of  brown,  adding  to  that  the  third  part 
Plus  one  half  of  the  ebony  cattle  all  taken  together. 

Further,  the  group  of  the  black  equaled  one  fourth  of  the  flecked 
Plus  one  fifth  of  them,  taken  along  with  the  total  of  brown  ones. 

Finally,  you  must  assume,  friend,  that  the  total  with  spots 
Equaled  a  sixth  plus  a  seventh  part  of  the  herd  of  white  cattle, 

Adding  to  that  the  entire  herd  of  the  brown-colored  kine. 

Yet  quite  different  proportions  held  for  the  female  contingent: 

Cows  with  white-colored  hair  equaled  in  number  one  third 
Plus  one  fourth  of  the  black-hued  cattle,  the  males  and  the  females. 

Further,  the  cows  colored  black  totaled  in  number  one  fourth 
Plus  one  fifth  of  the  whole  spotted  herd,  in  this  computation 
Counting  in  each  spotted  cow,  each  spotted  bull  in  the  group. 

Likewise,  the  spotted  cows  comprised  the  fifth  and  the  sixth  part 
Out  of  the  total  of  brown  cattle  that  went  out  to  graze. 

Lastly,  the  cows  colored  brown  made  up  a  sixth  and  a  seventh 
Out  of  the  white-coated  herd,  female  and  male  ones  alike. 

If,  my  friend,  you  can  tell  me  exactly  what  was  the  number 
Gathered  together  there  then,  also  the  accurate  count 


Color  by  color  of  every  well-nourished  male  and  each  female, 

Then  with  right  you’ll  be  called  skillful  in  keeping  accounts. 

But  you  will  not  be  reckoned  a  wise  man  yet;  if  you  would  be, 

Come and answer me this, using new data I give:

When  the  entire  aggregation  of  white  bulls  and  that  of  the  black  bulls 
Joined  together,  they  all  made  a  formation  that  was 
Equally  broad  and  deep;  the  far-flung  Sicilian  meadows 

Now  were  thoroughly  filled,  covered  by  great  crowds  of  bulls. 

But  when  the  brown  and  the  spotted  bulls  were  assembled  together, 

Then was a triangle formed; one bull stood at the tip;

None  of  the  brown-colored  bulls  was  missing,  none  of  the  spotted, 

Nor  was  there  one  to  be  found  different  in  color  from  these. 

If  this,  too,  you  discover  and  grasp  it  well  in  your  thinking, 

If,  my  friend,  you  supply  every  herd’s  make-up  and  count, 

Then with justice proclaim yourself victor and march about proudly,
For  your  fame  will  glow  bright  all  through  the  world  of  the  wise. 

Lessing,  however,  disputed  the  authorship  of  Archimedes.  So  also  did 
Nesselmann (Algebra der Griechen, 1842), the French writer Vincent (Nouvelles
Annales de Mathématiques, vol. XV, 1856), the Englishman Rouse Ball (A Short
Account of the History of Mathematics), and others.

The distinguished Danish authority on Archimedes J. L. Heiberg
(Quaestiones Archimedeae), the French mathematician P. Tannery (Sciences
exactes dans l’antiquité), as well as Krummbiegel and Amthor (Schlömilchs
Zeitschrift für Mathematik und Physik, vol. XXV, 1880), on the other hand, are
of  the  opinion  that  this  complete  form  of  the  problem  is  to  be  attributed  to 
Archimedes. 

The two conditions set forth in the last seven distichs require that X + Y be a
square number $U^2$ and Z + T a triangular number $\tfrac{1}{2}V(V + 1)$, as a result of which
we obtain the following relations:

(8) $X + Y = U^2$ and (9) $2Z + 2T = V^2 + V$.

If we substitute in (8) and (9) the values X, Y, Z, T in accordance with (I),
these equations are transformed into

$3828G = U^2$ and $4942G = V^2 + V$.

If we replace 3828, 4942, and G, respectively, with 4a (a being equal to 3 · 11
· 29 = 957), b, and cg, we obtain

(8') $U^2 = 4acg$, (9') $V^2 + V = bcg$.

U is consequently an integral multiple of 2, a, and c (a and c being products of distinct primes):

$U = 2acu$,

so that

$U^2 = 4a^2c^2u^2 = 4acg$

and

(8'') $g = acu^2$.

If this value for g is introduced into (9') we obtain

$V^2 + V = abc^2u^2$

or

$(2V + 1)^2 = 4abc^2u^2 + 1$.

If the unknown 2V + 1 is designated as v and the product $4abc^2 = 4 \cdot 3 \cdot 11 \cdot 29 \cdot 2 \cdot 7 \cdot 353 \cdot 4657^2$ is abbreviated as d, the last equation is transformed into

$v^2 - du^2 = 1$.

This  is  a  so-called  Fermat  equation,  which  can  be  solved  in  the  manner  described 
in  Problem  19.  The  solution  is,  however,  extremely  difficult  because  d  has  the 
inconveniently  large  value 


d  =  410286423278424 

and  even  the  smallest  solution  for  u  and  v  of  this  Fermat  equation  leads  to 
astronomical  figures. 

Even if u is assigned the smallest conceivable value 1, in solving for g the
value of ac is 4456749 and the combined number of white and black bulls is
over 79 billion (a billion here meaning 10^12). However, since the island of Sicily has an area of only 25500
km2  =  0.0255  billion  m2,  i.e.,  less  than  billion  m2,  it  would  be  quite  impossible 
to  place  that  many  bulls  on  the  island,  which  contradicts  the  assertion  of  the 
seventeenth  and  eighteenth  distichs. 
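The magnitudes quoted above are easy to recompute. The short Python sketch below is an illustration added here, not part of the original derivation; the variable names are assumptions of this sketch, and "billion" is read as 10^12, which is the sense the figures in the text require.

```python
a = 3 * 11 * 29            # 957, so that 4a = 3828
b = 4942                   # 2 * 7 * 353
c = 4657                   # the prime appearing in (II)
d = 4 * a * b * c**2       # coefficient in v^2 - d*u^2 = 1

u = 1                      # smallest conceivable value, as in the text
g = a * c * u**2           # relation g = a*c*u^2
white_and_black_bulls = (10366482 + 7460514) * g   # X + Y from (I')

sicily_m2 = 25500 * 10**6  # 25500 km^2 expressed in m^2

print(d)                                   # 410286423278424
print(g)                                   # 4456749
print(white_and_black_bulls > sicily_m2)   # True
```

Even with u = 1 the white and black bulls alone number roughly 7.9 · 10^13, several thousand animals for every square meter of the island.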


2 


The Weight Problem of Bachet de Méziriac


A  merchant  had  a  forty-pound  measuring  weight  that  broke  into  four  pieces 
as the result of a fall. When the pieces were subsequently weighed, it was found
that  the  weight  of  each  piece  was  a  whole  number  of  pounds  and  that  the  four 
pieces  could  be  used  to  weigh  every  integral  weight  between  1  and  40  pounds. 

What  were  the  weights  of  the  pieces? 

This problem stems from the French mathematician Claude Gaspard Bachet
de Méziriac (1581-1638), who solved it in his famous book Problèmes plaisants
et délectables qui se font par les nombres, published in 1624.

We  can  distinguish  the  two  scales  of  the  balance  as  the  weight  scale  and  the 
load  scale.  On  the  former  we  will  place  only  pieces  of  the  measuring  weight; 
whereas  on  the  load  scale  we  will  place  the  load  and  any  additional  measuring 
weights.  If  we  are  to  make  do  with  as  few  measuring  weights  as  possible  it  will 
be  necessary  to  place  measuring  weights  on  the  load  scale  as  well.  For  example, 
in  order  to  weigh  one  pound  with  a  two-pound  and  a  three-pound  piece,  we  place 
the  two-pound  piece  on  the  load  scale  and  the  three-pound  piece  on  the  weight 
scale. 

If  we  single  out  several  from  among  any  number  of  weights  lying  on  the 
scales,  e.g.,  two  pieces  weighing  5  and  10  lbs  each  on  one  scale,  and  three  pieces 
weighing 1, 3, and 4 lbs each on the other, we say that these pieces give the first
scale  a  preponderance  of  7  lbs. 

We  will  consider  only  integral  loads  and  measuring  weights,  i.e.,  loads  and 
weights  weighing  a  whole  number  of  pounds. 

If  we  have  a  series  of  measuring  weights  A,  B,  C,  ...,  which  when  properly 
distributed  upon  the  scales  enable  us  to  weigh  all  the  integral  loads  from  1 
through  n  lbs,  and  if  P  is  a  new  measuring  weight  of  such  nature  that  its  weight  p 
exceeds  the  total  weight  n  of  the  old  measuring  weights  by  1  more  than  that  total 
weight : 


p - n = n + 1 or p = 2n + 1,


it  is  then  possible  to  weigh  all  integral  loads  from  1  through  p  +  n  =  3n  +  1  by 
addition  of  the  weight  P  to  the  measuring  weights  A,  B,  C,  .... 

In  fact,  the  old  pieces  are  sufficient  to  weigh  all  loads  from  1  to  n  lbs.  In 
order  to  weigh  a  load  of  (p  +  x)  and/or  (p-x)  lbs,  where  x  is  one  of  the  numbers 
from  1  to  n,  we  place  the  measuring  weight  P  on  the  weight  scale  and  place 
weights  A,  B,  C,  ...  on  the  scales  in  such  a  manner  that  these  pieces  give  either 
the  weight  scale  or  the  load  scale  a  preponderance  of  x  lbs. 

This  being  established,  the  solution  of  the  problem  is  easy. 

In  order  to  carry  out  the  maximum  possible  number  of  weighings  with  two 
measuring  weights,  A  and  B,  A  must  weigh  1  lb  and  B  3  lbs.  These  two  pieces 
enable  us  to  weigh  loads  of  1,  2,  3,  4  lbs. 

If  we  then  choose  a  third  piece  C  such  that  its  weight 


c = 2 · 4 + 1 = 9 lbs,

it then becomes possible to use the three pieces A, B, C to weigh all integral
loads from 1 to c + 4 = 9 + 4 = 13.

Finally,  if  we  choose  a  fourth  piece  D  such  that  its  weight 


d = 2 · 13 + 1 = 27 lbs,


the  four  weights  A,  B,  C,  D  then  enable  us  to  weigh  all  loads  from  1  to  27  +  13  = 
40  lbs. 

Conclusion.  The  four  pieces  weigh  1,  3,  9,  27  lbs. 
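The conclusion can be confirmed by brute force. The following Python sketch (an illustration added here, with hypothetical helper names `placements` and `describe`) tries every way of putting each piece on the weight scale, on the load scale, or on neither, and records which loads result.

```python
from itertools import product

WEIGHTS = (1, 3, 9, 27)

def placements(weights):
    """Map each achievable positive load to one placement of the weights.

    A placement assigns each weight +1 (weight scale), -1 (load scale,
    next to the load) or 0 (not used).
    """
    table = {}
    for signs in product((-1, 0, 1), repeat=len(weights)):
        load = sum(s * w for s, w in zip(signs, weights))
        if load > 0:
            table.setdefault(load, signs)
    return table

table = placements(WEIGHTS)
assert sorted(table) == list(range(1, 41))   # every load from 1 to 40 lbs

def describe(load):
    """Return (pieces on the weight scale, pieces on the load scale)."""
    signs = table[load]
    weight_scale = [w for s, w in zip(signs, WEIGHTS) if s == 1]
    load_scale = [w for s, w in zip(signs, WEIGHTS) if s == -1]
    return weight_scale, load_scale

print(describe(20))   # ([3, 27], [1, 9]), i.e. 20 = 27 + 3 - 9 - 1
```

The 3^4 = 81 placements yield each of the loads -40, ..., 40 exactly once, which is why four pieces cannot reach beyond 40 lbs.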

Note.  Bachet’s  weight  problem  was  generalized  by  the  English 
mathematician MacMahon. In Volume 21 of the Quarterly Journal of
Mathematics  (1886)  MacMahon  determined  all  the  conceivable  sets  of  integral 
weights  with  which  all  loads  of  1  to  n  lbs  can  be  weighed. 
Newton’s Problem of the Fields and Cows


In  Newton’s  Arithmetica  universalis  (1707)  the  following  interesting  problem 
is  posed: 


a cows graze b fields bare in c days,
a' cows graze b' fields bare in c' days,
a" cows graze b" fields bare in c" days;


what  relation  exists  between  the  nine  magnitudes  a  to  c"? 

It  is  assumed  that  all  the  fields  provide  the  same  amount  of  grass,  that  the 
daily  growth  of  the  fields  remains  constant,  and  that  all  the  cows  eat  the  same 
amount  each  day. 

Solution.  Let  the  initial  amount  of  grass  contained  by  each  field  be  M,  the 
daily  growth  of  each  field  m,  and  the  daily  grass  consumption  of  each  cow  Q. 

On the evening of the first day the amount of grass remaining in the b fields is

bM + bm - aQ,

on the evening of the second day

bM + 2bm - 2aQ,

on the evening of the third day

bM + 3bm - 3aQ,

etc., so that on the evening of the cth day

bM + cbm - caQ.

And this value must be equal to zero, since the fields are grazed bare in c days.
This gives rise to the equation

(1)  bM + cbm = caQ.

In  like  manner  the  following  equations  are  obtained: 


(2)  b'M + c'b'm = c'a'Q

and

(3)  b"M + c"b"m = c"a"Q.

If  (1)  and  (2)  are  taken  as  linear  equations  for  the  unknowns  M  and  m,  we 
obtain 


$M = \frac{cc'(ab' - ba')}{bb'(c' - c)}\,Q, \qquad m = \frac{bc'a' - b'ca}{bb'(c' - c)}\,Q.$


If these values are introduced into equation (3) and the resulting equation is
multiplied by [bb'(c' - c)]/Q, we obtain the desired relation:

b"cc'(ab' - ba') + c"b"(bc'a' - b'ca) = c"a"bb'(c' - c).

The solution is more easily seen when expressed in the form of determinants.
If q represents the negative of Q (q = -Q), equations (1), (2), (3) assume the form

bM + cbm + caq = 0,
b'M + c'b'm + c'a'q = 0,
b"M + c"b"m + c"a"q = 0.


According  to  one  of  the  basic  theorems  of  determinant  theory,  the 
determinant  of  a  system  of  n  (3  in  this  case)  linear  homogeneous  equations 
possessing  n  unknowns  that  do  not  all  vanish  (M,  m,  q  in  this  case)  must  be  equal 
to  zero.  Consequently,  the  desired  relation  has  the  form 


$$\begin{vmatrix} b & bc & ca \\ b' & b'c' & c'a' \\ b'' & b''c'' & c''a'' \end{vmatrix} = 0.$$
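To see the determinant condition at work, one can evaluate it for concrete figures. The Python sketch below is an added illustration; the function name `newton_determinant` and the sample data (12 cows, 3 1/3 fields, 4 days; 21 cows, 10 fields, 9 days; 36 cows, 24 fields, 18 days) are assumptions of this sketch, chosen so that the three statements are mutually consistent.

```python
from fractions import Fraction as F

def newton_determinant(rows):
    """Determinant with rows (b, b*c, c*a) for three triples (a, b, c)."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = rows
    m = [[b1, b1 * c1, c1 * a1],
         [b2, b2 * c2, c2 * a2],
         [b3, b3 * c3, c3 * a3]]
    # Expansion along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Sample data as (cows a, fields b, days c); exact fractions avoid rounding.
rows = [(12, F(10, 3), 4), (21, 10, 9), (36, 24, 18)]
print(newton_determinant(rows))   # 0
```

Changing any one of the nine magnitudes, for instance 36 cows to 35, makes the determinant nonzero, so the relation can be used to solve for one magnitude when the other eight are given.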


Berwick’s  Problem  of  the  Seven  Sevens 


In  the  following  division  example,  in  which  the  divisor  goes  into  the  dividend 
without  a  remainder: 


* * 7 * * * * * * * : * * * * 7 * = * * 7 * *
* * * * * *
* * * * * 7 *
* * * * * * *
    * 7 * * * *
    * 7 * * * *
    * * * * * * *
    * * * * 7 * *
        * * * * * *
        * * * * * *


the  numbers  that  occupied  the  places  marked  with  the  asterisks  (*)  were 
accidentally  erased.  What  are  the  missing  numbers? 

This  remarkable  problem  comes  from  the  English  mathematician  E.  H. 
Berwick,  who  published  it  in  1906  in  the  periodical  The  School  World. 

Solution. We will assign a separate letter to each of the missing numerals
(three numerals that are not needed by name are left as asterisks).
The example then has the following appearance:

A B 7 C D E L Q W z : α β γ δ 7 ε = κ λ 7 μ ν
a b * c d e                          Second line
F G H I K 7 L                        Third line
f g h i k * l                        Fourth line
    M 7 N O P Q                      Fifth line
    m 7 n o p q                      Sixth line (7·b)
    R S T U * V W                    Seventh line
    r s t u 7 v w                    Eighth line
        X Y Z x y z                  Ninth line
        X Y Z x y z                  Tenth line

I. The first numeral α of the divisor b must be 1, since 7b, as the sixth line of
the example shows, possesses six numerals, whereas if α equaled 2, 7b would
possess seven numerals.

Since the remainders in the third and seventh lines possess six numerals, F
must equal 1 and R must equal 1, as a result of which f and r must also equal 1
(according to the outline).

Since b cannot exceed 199979 and the maximum value of μ is 9, the
product in the eighth line cannot exceed 1799811, and s < 8. And since S can
only be 9 or 0, and since there is no remainder in the ninth line under s, only the
second case is possible. Consequently, S = 0 and (since R = 1) s is also equal to
0. It also follows from R = 1 and S = 0 that M = m + 1, thus m ≤ 8, and the
product 7b of the sixth line cannot be higher than 87nopq.

II. Consequently, the only possible values for the second divisor numeral β
are 0, 1, and 2. (7 · 130000 is already higher than 900000.) β = 0 is eliminated
because even when multiplied by nine 109979 does not give a seven-figure
number, which, for example, is required by the eighth line.

Let us then consider the case of β = 1. This requires γ to be equal to only 0 or
1. (If γ ≥ 2, on determination of the second figure of line 6 one would have to
add to 7β = 7 · 1 = 7 the amount 1 coming from the product 7 · γ, whereas the
second figure must be 7.)

γ = 0, however, is impossible as a result of the seven figures of line 8, since
not even 9 · 110979 yields a seven-figure product.

In the event that γ = 1 the following conditions must be observed, as a glance
at line 8 will show: δ, ε, and μ must be so chosen that μ · 111δ7ε results in a
seven-place number, the third last figure of which is 7. The only hope of this is
offered by the multiplier μ = 9 (since even 8 · 111979 has only six places). Now
the third last figure of 9 · 111δ7ε, as is easily seen by experiment, can be a seven
only if δ = 0 or δ = 9. In the first case line 8 will not possess seven places even
when 111079 is multiplied by 9, and in the second case line 6 is 7 · 11197* =
783***, which is impossible. Thus, the case of γ = 1 is also excluded. The
possibility of β equaling 1 must, therefore, be discarded.

The only appropriate value for the second figure of the divisor is therefore β
= 2. From this it follows that m = 8 and M = 9.

III. The third figure γ of the divisor can only be 4 or 5, since 7 · 126000 is
greater and 7 · 124000 is smaller than the sixth line. Moreover, since 9 · 124000
is greater and 7 · 126000 is smaller than the eighth line (10tu7vw), μ must be
equal to 8.

Since 8 · 124979 = 999832 < 1000000 the assumption that γ = 4 fails to
satisfy the requirements of line 8, and γ therefore has to be equal to 5.

IV. Since the third last figure of 8 · 125δ7ε must be 7, we find by testing that δ
is equal to either 4 or 9. δ = 9 is eliminated because even 7 · 125970 = 881790
comes out greater than the sixth line, so that only δ = 4 is suitable. Thus, ε can be
considered one of numbers 0 to 4. However, whichever one of these is chosen,
we find for the third figure of the sixth line n = 8 from 7 · 12547ε = 878***.
Similarly, for the eighth line we obtain 8 · 12547ε = 10037**, and consequently t
= 0 and u = 3.

Since λb = λ · 12547ε results in a seven-place fourth line and only 8b and 9b
have seven places, λ is either 8 or 9.

V. From t = 0 and X ≥ 1 (together with R = r = 1, S = s = 0) it follows that T ≥
1, and from n = 8, N ≤ 9, it follows that T ≤ 1, so that T = 1. N is therefore equal
to 9 and X = 1. Since X = 1 and 2 · b > 200000 (line 9), it follows that ν = 1 and
also that Y = 2, Z = 5, x = 4, y = 7, and z = ε. With the results obtained at this
point the problem has the following appearance:

A B 7 C D E L Q W ε : 1 2 5 4 7 ε = κ λ 7 8 1
a b * c d e
1 G H I K 7 L
1 g h i k * l
    9 7 9 O P Q
    8 7 8 o p q
    1 0 1 U * V W
    1 0 0 3 7 v w
        1 2 5 4 7 ε
        1 2 5 4 7 ε


VI. In this case ε is one of the five numbers

0, 1, 2, 3, 4.

These five cases correspond to the number series

vw = 60, 68, 76, 84, 92,
opq = 290, 297, 304, 311, 318

and, depending upon whether λ is equal to 8 or 9, to the following values for the last two figures of line 4:

60, 68, 76, 84, 92

or

30, 39, 48, 57, 66.

This presents ten different possibilities. If we test each of them by going upward
in three successive additions beginning from lines 9 and 8 to line 7, then from
lines 7 and 6 to line 5, and finally from lines 5 and 4 to line 3, we find that only
when ε = 3 and λ = 8 do we obtain the requisite 7 for the next to last figure of
line 3. In this case vw = 84, the last four figures of line 7 are 6331, opq = 311,
OPQ = 944, the last six figures of line 4 are 003784, and GHIK7L = 101778.
This gives the problem the following appearance:


A B 7 C D E 8 4 1 3 : 1 2 5 4 7 3 = κ 8 7 8 1
a b * c d e
1 1 0 1 7 7 8
1 0 0 3 7 8 4
    9 7 9 9 4 4
    8 7 8 3 1 1
    1 0 1 6 3 3 1
    1 0 0 3 7 8 4
        1 2 5 4 7 3
        1 2 5 4 7 3


VII. Finally, since of all the multiples of b only 5b = 627365 added to the
division remainder 110177 of the third line gives a number containing a 7 in the
third place, we get κ = 5 and at the same time 627365 for the second line and
AB7CDE = 737542, which gives us all of the figures missing from the problem.
The completed division therefore reads 7375428413 : 125473 = 58781.
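The step-by-step elimination above can be cross-checked by a direct search. The following Python sketch is an added illustration and not the author's method: it simply tries every divisor of the form ****7*, builds the admissible quotient digits from the lengths and known sevens of the five product lines, and keeps a candidate only if the whole long division fits the scheme. The helper names (`matches`, `layout_ok`, `solve`) are assumptions of this sketch.

```python
from itertools import product

def matches(n, pattern):
    """True if n has exactly the digits of pattern ('*' = any digit)."""
    s = str(n)
    return len(s) == len(pattern) and all(p in ("*", ch) for p, ch in zip(pattern, s))

DIVIDEND = "**7*******"
PRODUCTS = ["******", "*******", "*7****", "****7**", "******"]  # lines 2, 4, 6, 8, 10
PARTIALS = ["*****7*", "*7****", "*******", "******"]            # lines 3, 5, 7, 9

def layout_ok(n, b, digits):
    """Carry out the long division n : b and compare lines 3, 5, 7, 9."""
    s = str(n)
    current = int(s[:6])                     # first partial dividend
    for i, q in enumerate(digits):
        rest = current - q * b
        if not 0 <= rest < b:
            return False
        if i < 4:                            # bring down the next figure
            current = 10 * rest + int(s[6 + i])
            if not matches(current, PARTIALS[i]):
                return False
    return rest == 0                         # the division leaves no remainder

def solve():
    solutions = []
    for b in range(100000, 1000000):
        # divisor ****7* and sixth line 7*b of the form *7****
        if b // 10 % 10 != 7 or not matches(7 * b, PRODUCTS[2]):
            continue
        choices = [[q for q in range(1, 10) if matches(q * b, pat)] for pat in PRODUCTS]
        choices[2] = [7]                     # third quotient figure is given
        for digits in product(*choices):
            quotient = int("".join(map(str, digits)))
            n = quotient * b
            if matches(n, DIVIDEND) and layout_ok(n, b, digits):
                solutions.append((n, b, quotient))
    return solutions

print(solve())
```

Under these assumptions the search is expected to report the single triple (7375428413, 125473, 58781), in agreement with the result of step VII.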
Kirkman’s Schoolgirl Problem


In  a  boarding  school  there  are  fifteen  schoolgirls  who  always  take  their  daily 
walks  in  rows  of  threes.  How  can  it  be  arranged  so  that  each  schoolgirl  walks  in 
the  same  row  with  every  other  schoolgirl  exactly  once  a  week? 


This extraordinary problem was posed in the Lady’s and Gentleman’s Diary
for  1850,  by  the  English  mathematician  T.  P.  Kirkman. 

Of  the  great  number  of  solutions  that  have  been  found  we  will  reproduce  two. 
One  is  by  the  English  minister  Andrew  Frost  (“General  Solution  and  Extension 
of  the  Problem  of  the  15  Schoolgirls,”  Quarterly  Journal  of  Pure  and  Applied 
Mathematics, vol. XI, 1871); the other is that of B. Peirce (“Cyclic Solutions of
the  School-girl  Puzzle,”  The  Astronomical  Journal,  vol.  VI,  1859-1861). 


Frost’s Solution. Mathematically expressed the problem consists of
arranging the fifteen elements x, a1, a2, b1, b2, c1, c2, d1, d2, e1, e2, f1, f2, g1, g2 in
seven columns of five triplets each in such a way that any two selected elements
always occur in one and only one of the 35 triplets. As the initial triplets of the
seven columns we shall select:

x a1 a2 | x b1 b2 | x c1 c2 | x d1 d2 | x e1 e2 | x f1 f2 | x g1 g2.

Then we have only to distribute the 14 elements a1, a2, b1, b2, ..., g1, g2
correctly over the other four lines of our system.

Using the seven letters a, b, c, d, e, f, g, we form a group of triplets in which
each  pair  of  elements  occurs  exactly  once,  specifically  the  group: 

abc,  ade,  afg,  bdf,  beg,  cdg,  cef.  (The  triplets  are  in  alphabetical  order.) 

From  this  group  it  is  possible  to  take  for  each  column  exactly  four  triplets 
that  contain  all  the  letters  except  those  contained  in  the  first  line  of  the  column. 
If  we  then  place  the  appropriate  triplets  in  alphabetical  order  in  each  column,  we 
obtain  the  following  preliminary  arrangement: 


Sun.      Mon.      Tues.     Wed.      Thurs.    Fri.      Sat.

x a1 a2   x b1 b2   x c1 c2   x d1 d2   x e1 e2   x f1 f2   x g1 g2
bdf       ade       ade       abc       abc       abc       abc
beg       afg       afg       afg       afg       ade       ade
cdg       cdg       bdf       beg       bdf       beg       bdf
cef       cef       beg       cef       cdg       cdg       cef

Now we have to index the triplets bdf, beg, cdg, cef, ade, afg, abc, i.e., to
provide them with the index numbers 1 and 2. We index them in the order just
mentioned, i.e., first all the triplets bdf, then all the triplets beg, etc., observing
the following rules:

I.  When  a  letter  in  one  column  has  received  its  index  number,  the  next  time 
that  letter  occurs  in  the  same  column  it  receives  the  other  index  number. 

II.  If  two  letters  of  a  triplet  have  already  been  assigned  index  numbers,  these 
two  index  numbers  must  not  be  used  in  the  same  sequence  for  the  same  letters  in 
other  triplets. 

III. If the index number of a letter is not determined by rules I. and II., the
letter is assigned the index number 1.

The  letters  are  indexed  in  three  steps. 

First step. The triplets bdf, beg, cdg, and all the letters aside from a that can
be indexed in accordance with this numbering system and rules I., II., and III.
are successively indexed.

Second step. The missing index numbers (in boldface in the diagram) of the
triplets ade and afg, as well as the index numbers obtained in accordance with
rule I. for the last two a’s in line 2 are assigned.

Third  step.  The  still  missing  index  numbers  of  the  a’s  in  columns  4  and  5  (in 
the  empty  spaces  of  the  printed  diagram)  are  inserted;  these  are  2  in  line  2  and  1 
in  line  3. 

This  method  results  in  the  following  completed  diagram,  which  represents  the 
solution  of  the  problem. 


        Sun.    Mon.    Tues.   Wed.    Thurs.  Fri.    Sat.

        xa1a2   xb1b2   xc1c2   xd1d2   xe1e2   xf1f2   xg1g2
        b1d1f1  a1d2e2  a1d1e1  a2b2c2  a2b1c1  a1b2c1  a1b1c2
        b2e1g1  a2f2g2  a2f1g1  a1f2g1  a1f1g2  a2d2e1  a2d1e2
        c1d2g2  c1d1g1  b1d2f2  b1e1g2  b2d1f2  b1e2g1  b2d2f1
        c2e2f2  c2e1f1  b2e2g2  c1e2f1  c2d2g1  c2d1g2  c1e1f2
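The completed diagram can be checked mechanically. The following is a minimal sketch, not part of the original solution: the names `DAYS` and `is_kirkman_solution` are our own, and the arrangement is transcribed from the diagram with the index numbers of a assigned as described in the three steps. The check confirms that every day uses each of the 15 girls exactly once and that no pair of girls walks together twice.

```python
from itertools import combinations

# Completed diagram, one list of five rows per day (Sunday to Saturday).
# Transcription of the diagram; the variable name is illustrative only.
DAYS = [
    ["x a1 a2", "b1 d1 f1", "b2 e1 g1", "c1 d2 g2", "c2 e2 f2"],
    ["x b1 b2", "a1 d2 e2", "a2 f2 g2", "c1 d1 g1", "c2 e1 f1"],
    ["x c1 c2", "a1 d1 e1", "a2 f1 g1", "b1 d2 f2", "b2 e2 g2"],
    ["x d1 d2", "a2 b2 c2", "a1 f2 g1", "b1 e1 g2", "c1 e2 f1"],
    ["x e1 e2", "a2 b1 c1", "a1 f1 g2", "b2 d1 f2", "c2 d2 g1"],
    ["x f1 f2", "a1 b2 c1", "a2 d2 e1", "b1 e2 g1", "c2 d1 g2"],
    ["x g1 g2", "a1 b1 c2", "a2 d1 e2", "b2 d2 f1", "c1 e1 f2"],
]


def is_kirkman_solution(days):
    """True if each day seats all 15 girls once and every pair meets exactly once."""
    all_girls = {g for day in days for row in day for g in row.split()}
    if len(all_girls) != 15:
        return False
    seen_pairs = set()
    for day in days:
        rows = [row.split() for row in day]
        walkers = [g for row in rows for g in row]
        # Every row holds three girls and the day is a partition of the school.
        if any(len(row) != 3 for row in rows) or sorted(walkers) != sorted(all_girls):
            return False
        for row in rows:
            for pair in combinations(sorted(row), 2):
                if pair in seen_pairs:  # this pair has already walked together
                    return False
                seen_pairs.add(pair)
    # 7 days * 5 rows * 3 pairs = 105 = number of pairs among 15 girls.
    return len(seen_pairs) == 15 * 14 // 2


print(is_kirkman_solution(DAYS))
```

Since 7 · 5 · 3 = 105 is exactly the number of pairs that can be formed from 15 girls, the absence of a repeated pair already implies that every pair occurs.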

Pierce’s solution (judged the best by Sylvester). Let one girl, whom we will indicate as *, walk in the middle of the same row on all days; we will divide the other girls into two groups of 7 and designate the first group by the Arabic numbers 1 to 7 or else by lower-case letters and the second group by the Roman numbers I to VII or else by capital letters. We will let an equation such as R = s indicate that the Roman number indicated by the letter R possesses the same numerical value as the Arabic numeral corresponding to the letter s. Also, we will designate the days of the week Sunday, Monday, ..., Saturday by the numerals 0, 1, 2, ..., 6.

Let  the  Sunday  arrangement  have  the  following  order: 

a    α    A
b    β    B
c    γ    C
d    *    D
E    F    G


From  this,  by  adding  r  =  R  to  each  numeral,  we  obtain  the  arrangement 

a + r    α + r    A + R
b + r    β + r    B + R
c + r    γ + r    C + R
d + r    *        D + R
E + R    F + R    G + R

for  the  rth  weekday.  Here  every  figure  thus  obtained  that  exceeds  7,  such  as 
perhaps  c  +  r  or  D  +  R,  will  represent  the  girl  who  receives  a  number  (c  +  r  -  7 
or  D  +  R  -  7),  that  is  7  below  the  figure  and  is  subsequently  converted  into  that 
number. 

The  arrangements  thus  obtained  yield  the  solution  of  the  problem  if  the 
following  three  conditions  are  satisfied: 

I. The three differences α - a, β - b, γ - c are 1, 2, and 3.

II. The seven differences A - a, A - α, B - b, B - β, C - c, C - γ, D - d form a complete residue system of incongruent numbers to the modulus 7 (cf. No. 19).

III. The three differences F - E, G - F, G - E are 1, 2, 3.

Proof. We take as a premise that the following congruences (cf. No. 19) are all related to the modulus 7.

1. Each girl x of the first group will come together exactly once with every other girl y of this group. The difference x - y is then (according to I.) congruent to only one of the 6 differences a - α, b - β, c - γ, α - a, β - b, γ - c. Let us assume x - y ≡ β - b or x - β ≡ y - b. Thus, if r represents the number of the day of the week that is congruent to x - β (or y - b), then

x ≡ β + r and y ≡ b + r,

so that the girls x and y walk in the same row on weekday r.

2.  Each  girl  x  of  the  first  group  comes  together  exactly  once  with  each  girl  X 
of  the  second  group. 

The difference X - x (according to II.) can be congruent to only one of the seven differences A - a, A - α, B - b, B - β, C - c, C - γ, D - d. Let us assume X - x ≡ C - γ or X - C ≡ x - γ. If s = S is the weekday number that is congruent to X - C (or x - γ), then we have

X ≡ C + S and x ≡ γ + s,

so that the girls X and x walk in the same row on weekday s.


3.  Each  girl  X  of  the  second  group  comes  together  exactly  once  with  every 
other  girl  Y  of  this  group. 

The difference X - Y is (according to III.) congruent to only one of the differences F - E, G - F, G - E, E - F, F - G, E - G. Let us assume that X - Y ≡ G - F or X - G ≡ Y - F. Then if R represents the weekday number that is congruent to X - G (or Y - F), we obtain

X ≡ G + R and Y ≡ F + R,

so  that  the  girls  X  and  Y  walk  in  the  same  row  on  weekday  R. 

Thus, we need only satisfy conditions I., II., and III. to obtain the Sunday arrangement.

We choose a = 1, α = 2, b = 3, consequently β = 5, and then c = 4, so that γ = 7 and d = 6. We then select A = I, and thus B = VI, C = II, and D = III, so that the differences mentioned in condition II. are the numbers 0, -1, 3, 1, -2, -5, -3, which are incongruent to the modulus 7. The numbers IV, V, and VII then remain for the letters E, F, G.

The  Sunday  arrangement  is  therefore 

1  2  I 

3  5  VI 

4  7  II 

6  *  III 

IV  V  VII 

The  weekday  rows,  in  order,  are  arranged  in  the  following  manner: 


Mon.            Tues.           Wed.

2   3   II      3   4   III     4   5   IV
4   6   VII     5   7   I       6   1   II
5   1   III     6   2   IV      7   3   V
7   *   IV      1   *   V       2   *   VI
V   VI  I       VI  VII II      VII I   III

Thurs.          Fri.            Sat.

5   6   V       6   7   VI      7   1   VII
7   2   III     1   3   IV      2   4   V
1   4   VI      2   5   VII     3   6   I
3   *   VII     4   *   I       5   *   II
I   II  IV      II  III V       III IV  VI
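The cyclic construction lends itself to a quick mechanical check. The sketch below is only an illustration under our own naming (`SUNDAY`, `shift`, `week`, `every_pair_once` are not from the text): it generates the six weekday arrangements from the Sunday arrangement by adding r modulo 7 within each group, leaving the girl * fixed, and then confirms that all 105 pairs occur exactly once.

```python
from itertools import combinations

# Sunday arrangement from the text: Arabic 1..7, Roman I..VII, fixed girl "*".
SUNDAY = [("1", "2", "I"), ("3", "5", "VI"), ("4", "7", "II"),
          ("6", "*", "III"), ("IV", "V", "VII")]
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII"]


def shift(girl, r):
    """Add r to a girl's number modulo 7 (a result above 7 is reduced by 7)."""
    if girl == "*":
        return girl
    if girl in ROMAN:
        return ROMAN[(ROMAN.index(girl) + r) % 7]
    return str((int(girl) - 1 + r) % 7 + 1)


def week(sunday):
    """Arrangements for the weekdays r = 0 (Sunday) through r = 6 (Saturday)."""
    return [[tuple(shift(g, r) for g in row) for row in sunday] for r in range(7)]


def every_pair_once(days):
    """True if the 7 * 5 * 3 = 105 pairs formed by the rows are all different."""
    pairs = [frozenset(p) for day in days for row in day
             for p in combinations(row, 2)]
    return len(pairs) == 105 and len(set(pairs)) == 105


for day in week(SUNDAY):
    print(day)
print(every_pair_once(week(SUNDAY)))
```

Replacing `SUNDAY` by another choice of a, α, b, β, ... that violates one of the conditions I., II., III. should make `every_pair_once` return `False`, which gives a convenient way of experimenting with the conditions.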
The Bernoulli-Euler Problem of the Misaddressed Letters


To  determine  the  number  of  permutations  of  n  elements  in  which  no  element 
occupies  its  natural  place. 

This  problem  was  first  considered  by  Niclaus  Bernoulli  (1687-1759),  the 
nephew  of  the  two  great  mathematicians  Jacob  and  Johann  Bernoulli.  Later 
Euler became interested in the problem, which he called a quaestio curiosa ex doctrina combinationis (a curious problem of combination theory), and he
solved  it  independently  of  Bernoulli. 

The  problem  can  be  stated  in  a  somewhat  more  concrete  form  as  the  problem 
of  the  misaddressed  letters: 

Someone  writes  n  letters  and  writes  the  corresponding  addresses  on  n 
envelopes.  How  many  different  ways  are  there  of  placing  all  the  letters  in  the 
wrong  envelopes? 

This  problem  is  particularly  interesting  because  of  its  ingenious  solution. 

Let the letters be known as a, b, c, ..., the corresponding envelopes as A, B, C, .... Let the number of misplacements, which we are seeking, be designated as $\overline{n}$.

Let us first consider all the cases in which a finds its way into B and b into A as one group, and all the cases in which a gets into B and b does not get into A as a second group.

The first group obviously includes $\overline{n-2}$ cases.

The number of cases falling into the second group can be determined if instead of b, c, d, e, ... and A, C, D, E, ... we write, say, b', c', d', e', ... and B', C', D', E', .... Accordingly, the number is $\overline{n-1}$.

The number of all the cases in which a ends up in B is then $\overline{n-1} + \overline{n-2}$. Since each operation of placing “a in C,” “a in D,” ... yields an equal number of cases, the total number $\overline{n}$ of all the possible cases is

$\overline{n} = (n-1)\,[\overline{n-1} + \overline{n-2}]$.

We write this recurrence formula

$\overline{n} - n\cdot\overline{n-1} = i\,[\overline{n-1} - (n-1)\cdot\overline{n-2}]$,

in which i represents -1, and apply it to the letter numbers 3, 4, 5, ... up to n. Thus, we obtain

$\overline{3} - 3\cdot\overline{2} = i\,[\overline{2} - 2\cdot\overline{1}]$,

$\overline{4} - 4\cdot\overline{3} = i\,[\overline{3} - 3\cdot\overline{2}]$,

...

$\overline{n} - n\cdot\overline{n-1} = i\,[\overline{n-1} - (n-1)\cdot\overline{n-2}]$.

By multiplying these (n - 2) equations we obtain

$\overline{n} - n\cdot\overline{n-1} = i^{\,n-2}\,[\overline{2} - 2\cdot\overline{1}]$,

or, since $\overline{1} = 0$, $\overline{2} = 1$, and $i^{\,n-2} = i^{\,n}$,

$\overline{n} - n\cdot\overline{n-1} = i^{\,n}$.

We then divide this equation by n!, which gives

$\frac{\overline{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{i^{\,n}}{n!}$.

If we replace n in this formula by the series 2, 3, 4, ..., n, we obtain

$\frac{\overline{2}}{2!} - \frac{\overline{1}}{1!} = \frac{i^2}{2!}$,

$\frac{\overline{3}}{3!} - \frac{\overline{2}}{2!} = \frac{i^3}{3!}$,

...

$\frac{\overline{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{i^{\,n}}{n!}$.

Addition of these (n - 1) equations results (since $\overline{1} = 0$) in

$\frac{\overline{n}}{n!} = \frac{i^2}{2!} + \frac{i^3}{3!} + \cdots + \frac{i^{\,n}}{n!}$.

From this we are finally able to obtain the desired number $\overline{n}$:

$\overline{n} = n!\left(\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots + \frac{i^{\,n}}{n!}\right)$.

If § represents a symbol such that the application of the binomial theorem (cf. No. 9) to $(§ - 1)^n$ allows ν! to be written for each power $§^\nu$ of the binomial expansion, the number can be expressed in the simpler form

$\overline{n} = (§ - 1)^n$.

For a value such as n = 4, for example, we obtain

$\overline{4} = (§ - 1)^4 = §^4 - 4§^3 + 6§^2 - 4§ + 1 = 4! - 4\cdot 3! + 6\cdot 2! - 4\cdot 1! + 1 = 9$,

which is easily checked by testing.

Similarly, the number of permutations that can be formed from n elements in which no element is in its natural place is $(§ - 1)^n$.

For  the  four  elements  1,  2,  3,  4,  for  example,  there  are  the  nine  permutations 
2143,  2341,  2413,  3142,  3412,  3421,  4123,  4312,  4321. 
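The recurrence, the alternating series, and a direct count can be compared for small n. The following sketch is our own illustration (the function names are hypothetical); it uses the starting values 0 for one letter and 1 for two letters stated in the derivation, and evaluates the series in exact integer arithmetic.

```python
from itertools import permutations
from math import factorial


def derangements_recurrence(n):
    """Count via D(n) = (n - 1) * (D(n - 1) + D(n - 2)), with D(1) = 0, D(2) = 1."""
    if n == 1:
        return 0
    prev2, prev1 = 0, 1  # D(1), D(2)
    for k in range(3, n + 1):
        prev2, prev1 = prev1, (k - 1) * (prev1 + prev2)
    return prev1


def derangements_series(n):
    """Count via n! * (1/2! - 1/3! + ... + (-1)^n / n!), using integers only."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(2, n + 1))


def derangements_brute_force(n):
    """Count the permutations of 0..n-1 in which no element keeps its place."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))


for n in range(1, 8):
    print(n, derangements_recurrence(n), derangements_series(n),
          derangements_brute_force(n))
```

For n = 4 all three functions return 9, in agreement with the nine permutations listed above.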

Note.  The  result  obtained  also  contains  the  solution  of  the  determinant 
problem: 

In  how  many  constituents  of  an  n-degree  determinant  do  no  principal 
diagonal  elements  occur? 

This is immediately seen if the rth element of the sth column is called $c_s^r$. The elements of the principal diagonal are then

$c_1^1,\ c_2^2,\ c_3^3,\ \ldots,\ c_n^n$.

The determinant therefore contains $(§ - 1)^n$ constituents outside the principal diagonal elements.
Euler’s Problem of Polygon Division


In how many ways can a (plane convex) polygon of n sides be divided into triangles by diagonals?


Leonhard  Euler  posed  this  problem  in  1751  to  the  mathematician  Christian 
Goldbach. For the number to be found, $E_n$, the number of possible divisions, Euler developed the formula:

(1)  $E_n = \frac{2\cdot 6\cdot 10 \cdots (4n - 10)}{(n-1)!}$.


This  problem  is  of  the  greatest  interest  because  it  involves  many  difficulties 
in  spite  of  its  innocuous  appearance,  as  many  a  surprised  reader  will  discover  if 
he  attempts  to  derive  the  Euler  formula  without  outside  assistance.  Euler  himself 
said, “The process of induction I employed was quite laborious.”

In  the  simplest  cases  n  =  3,  4,  5,  6  the  various  divisions 

$E_3 = 1,\ E_4 = 2,\ E_5 = 5,\ E_6 = 14$

are  easily  obtained  from  the  graphic  representations.  But  this  method  soon 
becomes  impossible  as  the  number  of  angles  is  increased. 

In 1758 Segner, to whom Euler had communicated the first seven division numbers 1, 2, 5, 14, 42, 132, 429, established a recurrence formula for $E_n$ (Novi Commentarii Academiae Petropolitanae pro annis 1758 et 1759, vol. VII) which we will begin by deriving.

Let the angles of any convex polygon of n angles be 1, 2, 3, ..., n. For each of the $E_n$ possible divisions of the polygon of n angles we may take the side n1 as the base line of a triangle the apex of which is situated at one of the angles 2, 3, 4, ..., n - 1 in accordance with the division selected. If the apex is, for example, situated at angle r, on one side of the triangle n1r there is a polygon of r angles and on the other a polygon of s angles, r + s being equal to n + 1 (since the apex r belongs to both the polygon of r angles and the polygon of s angles).

Since the polygon of r angles (or r-gon) permits $E_r$ divisions and the s-gon permits $E_s$ divisions, and since each division of the r-gon can be combined with every division of the s-gon to form a division of the given n-gon, the mere choice of the apex r results in $E_r \cdot E_s$ different divisions of the given n-gon.

Since, then, r can possess successively every value of the series 2, 3, ..., n - 1 and s can accordingly possess successively every value of the series n - 1, n - 2, ..., 3, 2, it follows that

(2)  $E_n = E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2$,

where the factor $E_2$, which is merely added for better appearance, has the value 1.

Formula (2) is Segner’s recurrence formula. It confirms the previously given values for $E_3$ to $E_6$ as well as giving

$E_7 = E_2E_6 + E_3E_5 + E_4E_4 + E_5E_3 + E_6E_2 = 42$,

$E_8 = E_2E_7 + E_3E_6 + E_4E_5 + E_5E_4 + E_6E_3 + E_7E_2 = 132$,

etc.
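Segner's recurrence is easily carried out by machine and compared with Euler's closed formula (1). The sketch below is an illustration only; the function names `segner` and `euler_formula` are our own, and $E_2 = 1$ is taken as the starting value, as in the text.

```python
from math import factorial


def segner(n_max):
    """Return {n: E_n} for 2 <= n <= n_max using Segner's recurrence (2)."""
    E = {2: 1}
    for n in range(3, n_max + 1):
        # Apex r runs over 2, ..., n - 1; the other polygon has s = n + 1 - r angles.
        E[n] = sum(E[r] * E[n + 1 - r] for r in range(2, n))
    return E


def euler_formula(n):
    """E_n = 2 * 6 * 10 * ... * (4n - 10) / (n - 1)!, evaluated for n >= 3."""
    product = 1
    for k in range(3, n + 1):
        product *= 4 * k - 10
    return product // factorial(n - 1)


E = segner(9)
print([E[n] for n in range(3, 10)])
print([euler_formula(n) for n in range(3, 10)])
```

Both lines reproduce the seven division numbers 1, 2, 5, 14, 42, 132, 429 that Euler communicated to Segner. The recurrence needs a sum of n - 2 products for each new value, whereas the closed formula needs only one further factor, which illustrates the remark on the unwieldiness of Segner's formula.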

As  the  index  number  is  increased  Segner’s  formula,  in  contrast  with  Euler’s, 
grows  more  and  more  unwieldy,  as  Goldbach  has  already  indicated. 

We  can  obtain  the  Euler  formula  (1)  most  simply  if  we  consider  Euler’s 
division  problem  or  Segner’s  recurrence  formula  in  the  light  of  an  idea  of 
Rodrigues (Journal de Mathematiques, 3 [1838]) and connect it with a problem
treated  by  the  French  mathematician  Catalan  in  the  year  1838  in  the  Journal  de 
Mathematiques. 

Catalan’s  problem  has  the  form: 

How  many  different  ways  can  a  product  of  n  different  factors  be  calculated  by 
pairs? 

We  say  that  a  product  is  calculated  by  pairs  when  it  is  always  only  two 
factors  that  are  multiplied  together  and  when  the  product  arising  from  such  a 
“paired”  multiplication  is  used  as  one  factor  in  the  continuation  of  the 
calculation.  Calculation  by  pairs  of  the  product  3  •  4  •  5  •  7,  for  example,  is 
carried  out  in  the  following  manner:  3  •  5  =  15,  4  •  15  =  60,  7  •  60  =  420.  For  the 
four-membered product abcd an alphabetical arrangement of the factors gives the following five paired multiplications:

[(a · b) · c] · d,  [a · (b · c)] · d,  (a · b) · (c · d),  a · [(b · c) · d],  a · [b · (c · d)].

A  product  in  which  the  paired  multiplications  that  are  to  be  carried  out  are 
marked  by  brackets  or  the  like  will  be  referred  to  in  abbreviated  form  as 
“paired.” 

{[(a · b) · c] · [(d · e) · (f · g)]} · {(h · i) · k} is therefore a paired product of the ten factors a to k. It is immediately seen that a paired product of n factors contains (n - 1) multiplication signs and correspondingly involves (n - 1) paired multiplications (for every two factors).

Catalan’s  problem  requires  the  answers  to  two  questions: 

1 .  How  many  paired  products  of  n  different  prescribed  factors  are  there? 

2. How many paired products can be formed from n factors if the sequence of the factors (e.g., an alphabetical sequence) is prescribed?

The  first  number  we  will  designate  as  Rn  and  the  second  as  Cn. 


The simplest method of obtaining $R_n$ (according to Rodrigues) is by means of a recurrence formula. We will imagine the $R_n$ n-membered paired products to be formed of the n given factors $f_1, f_2, \ldots, f_n$; we will add to this an (n + 1)th factor $f_{n+1} = f$ and form from the available $R_n$ n-membered products all the $R_{n+1}$ (n + 1)-membered products of the factors $f_1, f_2, \ldots, f_{n+1}$.

Now each of the $R_n$ n-membered products P includes (n - 1) paired multiplications of the form A · B. If we use f once as the multiplier in front of A, once as the multiplicand after A, once as the multiplier before B and once as the multiplicand after B, we thereby obtain from A · B four new paired products (f · A) · (B), (A · f) · (B), (A) · (f · B), and (A) · (B · f).

Since these four arrangements of the factor f can be effected for each of the n - 1 paired subproducts of P, we obtain from P 4(n - 1) (n + 1)-membered paired products. Moreover, we also obtain from P the two (n + 1)-membered paired products f · P and P · f. The described arrangement of the factor f thus yields from only one (P) of the $R_n$ n-membered products (4n - 2) (n + 1)-membered products. From all $R_n$ n-membered paired products we therefore obtain $R_n \cdot (4n - 2)$ (n + 1)-membered paired products. The sought-for recurrence formula accordingly reads

(3)  $R_{n+1} = (4n - 2)R_n$.

To obtain an independent representation of $R_n$ we begin with $R_2 = 2$ (two factors a and b yield only two products: a · b and b · a) and we infer from (3) $R_3 = 6R_2 = 2\cdot 6$, $R_4 = 10R_3 = 2\cdot 6\cdot 10$, $R_5 = 14R_4 = 2\cdot 6\cdot 10\cdot 14$, etc., and finally

(4)  $R_n = 2\cdot 6\cdot 10\cdot 14 \cdots (4n - 6)$.


The  second  question  can  also  be  answered  by  returning  to  a  recurrence 
formula. 

Let the n factors f, in the prescribed order, be $\varphi_1, \varphi_2, \ldots, \varphi_n$. We will take from the $C_n$ paired n-membered products belonging to this series those having the form

(   ) · (   ),

where the parenthesis on the left includes the r members $\varphi_1, \varphi_2, \ldots, \varphi_r$ and the one on the right the s = n - r members $\varphi_{r+1}, \varphi_{r+2}, \ldots, \varphi_{r+s} = \varphi_n$. Since the left parenthesis, in accordance with its r members, can possess $C_r$ different forms and the right correspondingly can possess $C_s$ different forms, while each form belonging to the left parenthesis can combine with each form included in the right parenthesis, the above main form yields $C_r \cdot C_s$ different n-membered paired products.

Since,  moreover,  r  can  have  every  value  from  1  to  n  -  1,  it  follows  that 

(5)  $C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1$.

By using this recurrence formula and beginning from $C_1 = 1$ and $C_2 = 1$, we obtain the following sequence

$C_3 = C_1C_2 + C_2C_1 = 2$,

$C_4 = C_1C_3 + C_2C_2 + C_3C_1 = 5$,

$C_5 = C_1C_4 + C_2C_3 + C_3C_2 + C_4C_1 = 14$,

etc.

To obtain an independent representation of $C_n$ we can imagine that there are n! different sequences (permutations) of the factors $f_1, f_2, \ldots, f_n$, that each of these sequences possesses $C_n$ paired n-membered products and that all the sequences together possess $R_n$ such products. Then $R_n = C_n \cdot n!$ or

(6)  $C_n = \frac{R_n}{n!} = \frac{2\cdot 6\cdot 10 \cdots (4n - 6)}{n!}$.

Formulas (4) and (6) solve Catalan’s problem.
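Both answers to Catalan's problem can be confirmed for small n by writing out the paired products explicitly. The following is a sketch under our own naming (`pairings_in_order`, `count_C`, `count_R`, `formula_R`, `formula_C` are illustrative, not from the text); paired products are represented as fully bracketed strings, so that two products are counted as different exactly when their bracketed forms differ.

```python
from itertools import permutations
from math import factorial


def pairings_in_order(factors):
    """All paired products of the given factors, keeping their order, as strings."""
    if len(factors) == 1:
        return [factors[0]]
    results = []
    # Split into a left parenthesis with r members and a right one with the rest.
    for r in range(1, len(factors)):
        for left in pairings_in_order(factors[:r]):
            for right in pairings_in_order(factors[r:]):
                results.append(f"({left}*{right})")
    return results


def count_C(n):
    """Number of paired products of n factors in prescribed (alphabetical) order."""
    return len(pairings_in_order([chr(ord("a") + i) for i in range(n)]))


def count_R(n):
    """Number of paired products of n different factors in any order."""
    letters = [chr(ord("a") + i) for i in range(n)]
    return len({p for order in permutations(letters)
                for p in pairings_in_order(list(order))})


def formula_R(n):
    """Formula (4): R_n = 2 * 6 * 10 * ... * (4n - 6)."""
    product = 1
    for k in range(2, n + 1):
        product *= 4 * k - 6
    return product


def formula_C(n):
    """Formula (6): C_n = R_n / n!."""
    return formula_R(n) // factorial(n)


for n in range(2, 6):
    print(n, count_R(n), formula_R(n), count_C(n), formula_C(n))
```

For n = 4 the enumeration yields the five bracketings listed earlier for the product abcd, and 5 · 4! = 120 = 2 · 6 · 10 products when the order of the factors is left free.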

Now  for  Euler’s  formula! 

From the indicated values

$E_2 = C_1 = 1,\ E_3 = C_2 = 1,\ E_4 = C_3 = 2,\ E_5 = C_4 = 5,\ E_6 = C_5 = 14$

and formulas (2) and (5) it immediately follows that in general

(7)  $E_n = C_{n-1}$.

[The proof is by induction. We assume that (7) is true for all indices through n, so that $E_2 = C_1$, $E_3 = C_2$, ..., $E_n = C_{n-1}$.

According to (2) and (5)

$E_{n+1} = E_2E_n + E_3E_{n-1} + \cdots + E_nE_2$,

$C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1$.

Since the right sides of the two last equations correspond member for member, it also follows that

$E_{n+1} = C_n$,

i.e., formula (7) is valid for every index.]

(6) and (7) give us Euler’s formula immediately:

(8)  $E_n = \frac{2\cdot 6\cdot 10 \cdots (4n - 10)}{(n-1)!}$.

In conclusion we would like to give a slight simplification of Euler’s formula. It is

$E_n = \frac{2^{n-2}\cdot 1\cdot 3\cdot 5 \cdots (2n-5)}{(n-1)!} = \frac{2^{n-2}\,(2n-3)!}{(n-1)!\;2^{n-2}\cdot (n-2)!\,(2n-3)}$,

and consequently

$E_n = \frac{1}{k}\binom{k}{f}$,

where f = n - 2 is the number of triangles into which the n-gon can always be divided and k = 2n - 3 is the number of sides bounding these triangles.

Recently  ( Zeitschrift  fur  math,  und  naturw.  Unterricht,  1941,  vol.  4)  H. 
Urban  derived  Euler’s  formula  in  the  following  manner. 

He first calculated $E_5$, $E_6$, $E_7$ by means of the Segner recurrence formula and “inferred” the following:

$E_2 = 1,\ E_3 = 1,\ E_4 = 2,\ E_5 = 5,\ E_6 = 14,\ E_7 = 42,$

$\frac{E_3}{E_2} = \frac{2}{2},\ \frac{E_4}{E_3} = \frac{6}{3},\ \frac{E_5}{E_4} = \frac{10}{4},\ \frac{E_6}{E_5} = \frac{14}{5},\ \frac{E_7}{E_6} = \frac{18}{6},$

on the strength of which he surmised that $E_n$ would have to be

(I)  $E_n = \frac{4n - 10}{n - 1}\,E_{n-1}$.

(Unfortunately, he does not say whether it was Euler’s recurrence formula or some other idea that led him to his “inference.”)

This  recurrence  formula  is  certainly  correct  for  the  first  values  of  the  index  n. 
To  prove  its  general  validity  the  conclusion  for  n  is  applied  to  n  +  1:  it  is 
assumed  that  the  recurrence  formula  (I)  is  true  for  all  index  numbers  from  1  to  n 
-  1  and  it  is  demonstrated  that  it  is  therefore  also  true  for  n. 

The proof is carried out by means of the expression

(II)  $S = 1\cdot E_2E_{n-1} + 2\cdot E_3E_{n-2} + 3\cdot E_4E_{n-3} + \cdots + (n-2)\cdot E_{n-1}E_2$

or, written in the reverse order,

(III)  $S = (n-2)\cdot E_{n-1}E_2 + (n-3)\cdot E_{n-2}E_3 + (n-4)\cdot E_{n-3}E_4 + \cdots + 1\cdot E_2E_{n-1}$.

Columnar addition of these two equations gives

$2S = (n-1)\,[E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2]$

or, since in accordance with Segner’s recurrence formula the value of the expression within the brackets is equal to $E_n$,

(IV)  $2S = (n-1)E_n$.

Now the left-hand factor $E_r$ in each product $E_r \cdot E_s$ of (II) and (III) (except the case in which r = 2) is replaced in accordance with the recurrence formula (I) by $\lambda_{r-1}E_{r-1}/(r-1)$ with $\lambda_\nu = 4\nu - 6$. This gives us

(II')  $S = E_2E_{n-1} + \lambda_2E_2E_{n-2} + \lambda_3E_3E_{n-3} + \cdots + \lambda_{n-2}E_{n-2}E_2$,

(III')  $S = \lambda_{n-2}E_{n-2}E_2 + \lambda_{n-3}E_{n-3}E_3 + \cdots + \lambda_2E_2E_{n-2} + E_2E_{n-1}$,

and by columnar addition of these two lines, since $\lambda_\nu + \lambda_{n-\nu} = 4n - 12$, we obtain

$2S = E_{n-1} + (4n-12)\,[E_2E_{n-2} + E_3E_{n-3} + \cdots + E_{n-2}E_2] + E_{n-1}$

or, since the expression within brackets is equal to $E_{n-1}$,

(V)  $2S = (4n-10)E_{n-1}$.

Equations (IV) and (V) give us

$E_n = \frac{4n-10}{n-1}\,E_{n-1}$,

so that Euler’s recurrence formula (I) is thereby shown to be valid for the index number n, also, and thus generally valid.
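The two evaluations of the auxiliary sum S can be confirmed numerically. The sketch below is our own illustration with hypothetical names (`segner_numbers`, `weighted_sum`); it computes the division numbers from Segner's recurrence alone and then checks that 2S equals both (n - 1)E_n and (4n - 10)E_{n-1}.

```python
def segner_numbers(n_max):
    """Return {n: E_n} for 2 <= n <= n_max from Segner's recurrence, E_2 = 1."""
    E = {2: 1}
    for n in range(3, n_max + 1):
        E[n] = sum(E[r] * E[n + 1 - r] for r in range(2, n))
    return E


def weighted_sum(E, n):
    """S = 1*E_2*E_(n-1) + 2*E_3*E_(n-2) + ... + (n-2)*E_(n-1)*E_2."""
    return sum((r - 1) * E[r] * E[n + 1 - r] for r in range(2, n))


E = segner_numbers(12)
for n in range(3, 13):
    S = weighted_sum(E, n)
    print(n, 2 * S, (n - 1) * E[n], (4 * n - 10) * E[n - 1])
```

The three printed values agree in every row; for n = 5, for example, S = 2 + 2 + 6 = 10, and 2S = 20 = 4 · 5 = 10 · 2.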
Lucas’ Problem of the Married Couples


How  many  ways  can  n  married  couples  be  seated  about  a  round  table  in  such 
a  manner  that  there  is  always  one  man  between  two  women  and  none  of  the  men 
is  ever  next  to  his  own  wife? 


This problem appeared (probably for the first time) in 1891 in the Théorie des Nombres of the French mathematician Edouard Lucas (1842-1891), author of the famous work Récréations mathématiques. The English mathematician Rouse Ball has said of this problem, “The solution is far from easy.”

The  problem  has  been  solved  by  the  Frenchmen  M.  Laisant  and  M.  C. 
Moreau  and  by  the  Englishman  H.  M.  Taylor.  A  solution  based  upon  modern 
viewpoints  is  to  be  found  in  MacMahon’s  Combinatory  Analysis.  The  approach 
adopted  here  is  essentially  that  of  Taylor  ( The  Messenger  of  Mathematics,  32, 
1903). 

We will number the series of circularly arranged chairs from 1 through 2n. The wives will then all have to be seated on the even- or odd-numbered chairs. In each of these two cases there are n! different possible seating arrangements, so that there are 2 · n! different possible seating arrangements for the women alone.

We  will  assume  that  the  women  have  been  seated  in  one  of  these 
arrangements  and  we  will  maintain  this  seating  arrangement  throughout  the 
following.  The  nucleus  of  the  problem  then  consists  of  determining  the  number 
of  possible  ways  of  seating  the  men  between  the  women. 

Let us designate the women in the assumed seating sequence as $F_1, F_2, \ldots, F_n$, their respective husbands $M_1, M_2, \ldots, M_n$, the couples $(F_1, M_1), (F_2, M_2), \ldots$ as 1, 2, ... and arrangements in which there are n married couples as n-pair arrangements. Let us designate the husbands about whom we have no further information as $X_1, X_2, \ldots$.

Let

$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}X_{n+1}$

be an (n + 1)-pair arrangement in which none of the husbands sits beside his own wife. (It must be remembered that the arrangement is circular, so that $X_{n+1}$ is seated between $F_{n+1}$ and $F_1$.) If we take $F_{n+1}$ and $M_{n+1} = X_\nu$ out of the arrangement and replace $X_\nu$ with $X_{n+1} = M_\mu$, we obtain the n-pair arrangement

$F_1X_1F_2X_2 \ldots F_\nu M_\mu F_{\nu+1} \ldots F_nX_n$.

This arrangement can occur in three ways:

1. No man sits next to his wife (thus $M_\mu \neq M_\nu$, $M_\mu \neq M_{\nu+1}$, $X_n \neq M_1$).

2. One man sits next to his own wife (namely when $M_\mu = M_\nu$ or $M_\mu = M_{\nu+1}$ or else $X_n = M_1$).

3. Two men sit next to their own wives (when $M_\mu = M_\nu$ or $M_\mu = M_{\nu+1}$ and at the same time $X_n = M_1$, that is, when in our arrangement the order $M_1F_1$ occurs).

Thus,  we  must  consider  other  seating  arrangements  in  addition  to  the  one 
prescribed  in  the  problem. 

In the following we will distinguish between three types of arrangements: arrangements A, B, and C. An A-arrangement will be one in which no man sits next to his wife. A B-arrangement will be one in which a certain man sits on a certain side of his wife. Finally, a C-arrangement will be one in which a certain man sits on a certain side of his wife and another man (which one is not prescribed) sits alongside his wife, the side likewise not being prescribed.

We will designate the number of n-pair A-, B-, C-arrangements as $A_n$, $B_n$, $C_n$, respectively.

First we will try to determine the relationships among the six magnitudes $A_n$, $B_n$, $C_n$, $A_{n+1}$, $B_{n+1}$, $C_{n+1}$; we will begin with the simplest of these relationships.

Let us consider the $B_{n+1}$ B-arrangements

$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}M_{n+1}$

of the pairs 1, 2, ..., (n + 1), in which $M_{n+1}$ sits next to $F_{n+1}$ on her right. We will divide the arrangements into two groups in accordance with whether $X_n = M_1$ or $X_n \neq M_1$. We then remove the pair $F_{n+1}M_{n+1}$ from all of them. The first group then gives us all $B_n$ n-pair B-arrangements, and the second all $A_n$ n-pair A-arrangements, so that

(1)  $B_{n+1} = B_n + A_n$.

We can obtain a second relationship by considering the $C_{n+1}$ (n + 1)-pair C-arrangements

$M_1F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}$,

in which one of the men $X_1, X_2, \ldots, X_n$ sits next to his own wife. We also divide these arrangements into two groups in accordance with whether or not $X_1$ is equal to $M_{n+1}$.

The second group then contains (2n - 1) subgroups. In the first, $M_2$ is seated on the left of $F_2$, in the second on her right; in the third, $M_3$ sits on the left of $F_3$, in the fourth on her right, etc.; in the (2n - 1)th, $M_{n+1}$ is seated on the left of $F_{n+1}$.

If we leave the pair $M_1F_1$ out of all of the $C_{n+1}$ C-arrangements, we obtain from the first group all $C_n$ C-arrangements of the pairs 2, 3, 4, ..., (n + 1) in which $M_{n+1}$ is seated on the right of $F_{n+1}$, and from each subgroup of the second group we obtain $B_n$ B-arrangements of the pairs 2, 3, ..., (n + 1), so that

(2)  $C_{n+1} = C_n + (2n - 1)B_n$.

As we found above, if we remove the pair $F_{n+1}$, $M_{n+1}$ from an (n + 1)-pair A-arrangement $F_1X_1F_2X_2 \ldots F_{n+1}X_{n+1}$ and replace the $M_{n+1}$ that has been removed with $X_{n+1}$, the arrangement is transformed into an n-pair A-, B-, or C-arrangement.

Conversely, we obtain an A-arrangement of the (n + 1) pairs 1, 2, ..., (n + 1) when we insert $F_{n+1}M_{n+1}$ before $F_1$ of an A-, B-, or C-arrangement of the n pairs 1, 2, ..., n and then exchange the places of $M_{n+1}$ and some other man (in such a manner that none of the men is seated next to his own wife after the exchange of places). It is also clear that this method gives us all the A-arrangements of the (n + 1) pairs 1, 2, ..., (n + 1).

In order to find $A_{n+1}$ it is therefore only necessary to determine the number of ways in which this insertion and the subsequent exchange can be accomplished for all possible A-, B-, and C-arrangements of the n pairs 1 through n.

We accomplish the described formation of the (n + 1)-pair A-arrangements in three steps.

I. Formation from A-arrangements.

After the insertion:

$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}M_{n+1}$

we can exchange the places of $M_{n+1}$ and any other man except $X_n$ and $M_1$, so that from each of the $A_n$ n-pair A-arrangements we obtain (n - 2) (n + 1)-pair A-arrangements. Consequently, we obtain a total of

$(n - 2)A_n$ (n + 1)-pair A-arrangements.

II. Formation from B-arrangements.

The n-pair B-arrangements exhibit the following 2n forms:

1.  ... $F_1M_1$ ...
2.  ... $F_1M_2F_2$ ...
3.  ... $F_2M_2F_3$ ...
...
(2n - 2).  ... $F_{n-1}M_nF_n$ ...
(2n - 1).  ... $F_nM_nF_1$ ...
2n.  ... $F_nM_1F_1$ ....

And there are $B_n$ of each of these forms.

Our process of formation is not applicable to the first and the (2n - 1)th of these forms (since the inserted $M_{n+1}$ would have to be exchanged with $M_1$ or $M_n$, as a result of which, however, $M_1$ would end up on the left side of $F_1$, or $M_{n+1}$ would be on the left side of $F_{n+1}$).

In the second, third, ..., (2n - 2)th form, the exchange of the inserted $M_{n+1}$ with $M_2, M_2, M_3, M_3, \ldots, M_{n-1}, M_{n-1}, M_n$ transforms the n-pair B-arrangements into (n + 1)-pair A-arrangements, as a result of which a total of

$(2n - 3)B_n$ (n + 1)-pair A-arrangements

are obtained.

In the (2n)th form, the inserted $M_{n+1}$ can be exchanged with any of the men $M_2, M_3, \ldots, M_n$, as a result of which a total of

$(n - 1)B_n$ (n + 1)-pair A-arrangements

are obtained.

III. Formation from C-arrangements.

Our method transforms any one of the $C_n$ n-pair C-arrangements

$M_1F_1X_2F_2X_3F_3 \ldots X_nF_n$

into an (n + 1)-pair A-arrangement if we switch the places of $M_{n+1}$ and the man $M_\nu$ seated next to his wife (ν being one of the values 2, 3, 4, ..., n). In this manner we obtain from every n-pair C-arrangement an (n + 1)-pair A-arrangement, which corresponds to a total of

$C_n$ (n + 1)-pair A-arrangements.

Thus, the methods of formation described under I., II., and III. give us all of the (n + 1)-pair A-arrangements, or a total of

$[(n - 2)A_n + (3n - 4)B_n + C_n]$

arrangements, so that

(3)  $A_{n+1} = (n - 2)A_n + (3n - 4)B_n + C_n$.
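Relations (1), (2), and (3) can be tested against a direct count for a small number of couples. The sketch below is our own illustration, not Taylor's method: the names `count_arrangements` and `relations_hold` are hypothetical, the women are numbered from 0, and we assume the reading that in a B-arrangement only the prescribed man sits beside his wife and in a C-arrangement exactly one further man does. The prescribed man is taken to be the first husband sitting on the left of his wife.

```python
from itertools import permutations


def count_arrangements(n):
    """Return (A_n, B_n, C_n) by brute force for n couples.

    Women F_0..F_(n-1) are fixed in a circle; seat i lies between F_i and
    F_(i+1), so man j is next to his wife exactly when he sits in seat j
    (on her right) or in seat j - 1 (on her left).
    """
    def beside_wife(seat, man):
        return man == seat or man == (seat + 1) % n

    A = B = C = 0
    for men in permutations(range(n)):
        offenders = [m for seat, m in enumerate(men) if beside_wife(seat, m)]
        if not offenders:
            A += 1
        elif men[n - 1] == 0:  # prescribed man M_0 sits on the left of F_0
            if len(offenders) == 1:
                B += 1
            elif len(offenders) == 2:
                C += 1
    return A, B, C


def relations_hold(n):
    """Check relations (1), (2), (3) between the n-pair and (n+1)-pair counts."""
    A, B, C = count_arrangements(n)
    A1, B1, C1 = count_arrangements(n + 1)
    return (B1 == B + A,
            C1 == C + (2 * n - 1) * B,
            A1 == (n - 2) * A + (3 * n - 4) * B + C)


for n in range(3, 7):
    print(n, count_arrangements(n), relations_hold(n))
```

The enumeration runs through all n! placements of the men, so it is only practical for a handful of couples, but that is enough to see the three relations at work.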

In order to obtain formulas in which only the same capital letters occur, we infer from (1)

$$A_n = B_{n+1} - B_n \quad\text{and}\quad A_{n+1} = B_{n+2} - B_{n+1}$$

and introduce these values into (3). This gives

$$B_{n+2} = (n - 1)B_{n+1} + (2n - 2)B_n + C_n.$$

If we then replace n by n + 1, it follows that

$$B_{n+3} = nB_{n+2} + 2nB_{n+1} + C_{n+1}.$$

If we subtract the next to the last equation from the last one and take (2) into consideration, we get

$$B_{n+3} = (n + 1)[B_{n+2} + B_{n+1}] + B_n$$

or, if we replace n + 1 here by n,

$$(4)\qquad B_{n+2} = n(B_{n+1} + B_n) + B_{n-1}.$$

This  simple  recurrence  formula  for  the  B’s  enables  us  to  calculate  from  three 
successive  B’s  the  B  that  follows  immediately. 

It is also possible to derive a recurrence formula in which only three successive B's are connected, i.e., a formula having the form

$$(5)\qquad e_n B_{n+1} + f_n B_n + g_n B_{n-1} = c,$$

in which the coefficients $e_n$, $f_n$, $g_n$ represent known functions of n and c is a constant.

In order to find it we replace n in (5) with (n + 1) and obtain

$$e_{n+1} B_{n+2} + f_{n+1} B_{n+1} + g_{n+1} B_n = c.$$

Subtraction of this equation from (5) gives

$$-e_{n+1} B_{n+2} + (e_n - f_{n+1}) B_{n+1} + (f_n - g_{n+1}) B_n + g_n B_{n-1} = 0.$$

In order to find the equations of condition for the coefficients e, f, g, which are still unknown, we compare the formula obtained with equation (4) after equation (4) has been multiplied by $g_n$:

$$g_n B_{n+2} = n g_n B_{n+1} + n g_n B_n + g_n B_{n-1}.$$

Thus, if we are able to choose e, f, g so as to satisfy the three conditions

$$\text{(I)}\ \ e_{n+1} = g_n, \qquad \text{(II)}\ \ e_n - f_{n+1} = n g_n, \qquad \text{(III)}\ \ f_n - g_{n+1} = n g_n,$$

formula (5) gives us the sought-for recurrence formula.

From (III) it follows that

$$f_n = g_{n+1} + n g_n \quad\text{or}\quad f_{n+1} = g_{n+2} + (n + 1) g_{n+1},$$

and from (II) and (I)

$$f_{n+1} = e_n - n g_n = g_{n-1} - n g_n.$$

By equating the two values obtained for $f_{n+1}$ we get

$$(n + 1) g_{n+1} + n g_n = g_{n-1} - g_{n+2}.$$

It is easily seen that

$$g_n = n\varepsilon^n \qquad (\varepsilon = -1)$$

is a solution of this equation. This, according to (I), yields

$$e_n = g_{n-1} = -(n - 1)\varepsilon^n$$

and, according to (III),

$$f_n = g_{n+1} + n g_n = \varepsilon^n (n^2 - n - 1).$$

Equation (5) is thereby transformed into

$$(n - 1)B_{n+1} - (n^2 - n - 1)B_n - nB_{n-1} = -c\,\varepsilon^n.$$

In order to determine the constant c, we set n equal to 4, we observe that $B_3 = 0$, $B_4 = 1$, and $B_5 = 3$, and we obtain c = 2.

The sought-for recurrence formula consequently reads

$$(6)\qquad (n - 1)B_{n+1} = (n^2 - n - 1)B_n + nB_{n-1} - 2\varepsilon^n.$$

In order to obtain a recurrence formula for the A's as well, we express $A_{n-1}$, $A_n$, and $A_{n+1}$, in accordance with (1) and (6), by $B_n$ and $B_{n+1}$. Thus we obtain

$$A_{n-1} = \frac{1 - n}{n} B_{n+1} + \frac{n^2 - 1}{n} B_n - \frac{2\varepsilon^n}{n},$$

$$A_n = B_{n+1} - B_n,$$

$$A_{n+1} = \frac{n^2 - 1}{n} B_{n+1} + \frac{n + 1}{n} B_n + \frac{2\varepsilon^n}{n},$$

and from this by elimination of $B_n$ and $B_{n+1}$ we obtain

$$(7)\qquad (n - 1)A_{n+1} = (n^2 - 1)A_n + (n + 1)A_{n-1} + 4\varepsilon^n.$$

This  is  Laisant’s  recurrence  formula.  It  makes  possible  the  calculation  of 
each  A  from  the  two  immediately  preceding  A’s. 

Thus, from $A_3 = 1$, $A_4 = 2$, and (7), it follows that $A_5 = 13$, which is still easy to check directly. Moreover, the whole series $A_6 = 80$, $A_7 = 579$, $A_8 = 4738$, $A_9 = 43387$, $A_{10} = 439792$, $A_{11} = 4890741$, $A_{12} = 59216642$, etc. can then be derived from (7). The difficult point in the calculation of A can therefore be considered as eliminated.

The  problem  is  solved. 

The number of possible seating arrangements of n married couples is $2A_n \cdot n!$, in which $A_n$ can be calculated from Laisant's recurrence formula.
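
As a check on the numbers quoted above, Laisant's formula (7) can be iterated mechanically. The following is a minimal Python sketch and not part of the original text; the function names `menage_numbers` and `seatings` are illustrative assumptions, and $\varepsilon^n$ is taken as $(-1)^n$, as in the derivation.

```python
from math import factorial


def menage_numbers(n_max):
    """Return {n: A_n} for 3 <= n <= n_max by iterating Laisant's formula (7)."""
    A = {3: 1, 4: 2}  # starting values quoted in the text
    for n in range(4, n_max):
        eps_n = -1 if n % 2 else 1  # epsilon**n with epsilon = -1
        # (7): (n - 1) A_{n+1} = (n^2 - 1) A_n + (n + 1) A_{n-1} + 4 eps^n
        numerator = (n * n - 1) * A[n] + (n + 1) * A[n - 1] + 4 * eps_n
        A[n + 1] = numerator // (n - 1)
    return A


def seatings(n):
    """Total number of seating arrangements of n couples: 2 * A_n * n!."""
    return 2 * menage_numbers(n)[n] * factorial(n)


if __name__ == "__main__":
    A = menage_numbers(12)
    print([A[n] for n in range(3, 13)])
    print(seatings(5))
```

Integer division is used because, for the values listed in the text, the right side of (7) is divisible by n - 1.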
Omar Khayyam's Binomial Expansion


To obtain the nth power of the binomial a + b in powers of a and b when n is any positive whole number.

Solution. In order to determine the binomial expansion we write

$$(a + b)^n = (a + b)(a + b) \cdots (a + b),$$

where the right side consists of a product of n identical parentheses (a + b). As is known, the multiplication of parentheses consists of choosing one term from each parenthesis and obtaining the product of the terms chosen, and continuing this process until all the possible choices are exhausted. Finally, the resulting products are added together.

A product of this sort has the following appearance:

$$P = a^{\alpha_1} b^{\beta_1} a^{\alpha_2} b^{\beta_2} a^{\alpha_3} b^{\beta_3} \cdots,$$

in which the factor a is taken from the first $\alpha_1$ parentheses, the factor b from the next $\beta_1$ parentheses, the factor a from the next $\alpha_2$ parentheses, etc. In this case $\alpha_1 + \beta_1 + \alpha_2 + \beta_2 + \cdots$ equals the number of parentheses present, i.e., n.

If we set $\alpha_1 + \alpha_2 + \alpha_3 + \cdots$ equal to $\alpha$ and $\beta_1 + \beta_2 + \cdots$ equal to $\beta$, the expression can be written in the simpler form

$$P = a^\alpha b^\beta \quad\text{with}\quad \alpha + \beta = n.$$

Now the product P can generally be obtained in many other ways than the one described, for example, by taking a from the first $\alpha$ parentheses and b from the last $\beta$ parentheses, or by taking b from the first $\beta$ parentheses and a from the last $\alpha$ parentheses, etc. If we assume that the product P occurs exactly C times in the method described, C being understood to represent an initially unknown whole number, then

$$G = C a^\alpha b^\beta$$

represents one term of the binomial expansion. The other terms have the same form, except that the exponents $\alpha$ and $\beta$ and the coefficients C are different. However, $\alpha + \beta$ always equals n.

The core of the problem is to determine the so-called binomial coefficient C, i.e., to answer the question: How many times does the product $P = a^\alpha b^\beta$ appear in the binomial expansion?

To answer this question we first write the factors a and b of the product one after another in the order in which we initially selected them from the parentheses:

$$\underbrace{aa \ldots a}_{\text{totaling } \alpha_1}\ \underbrace{bb \ldots b}_{\text{totaling } \beta_1}\ \underbrace{aa \ldots a}_{\text{totaling } \alpha_2} \ldots$$

This is a permutation of n elements in which $\alpha$ identical elements a and $\beta$ identical elements b occur. There are as many possible permutations of these elements as there are terms P resulting from the multiplication of the n parentheses (a + b).

But the number of permutations of n elements among which there appear $\alpha$ identical elements of one kind and $\beta$ identical elements of the other is $n!/(\alpha!\,\beta!)$. This is how often the product $a^\alpha b^\beta$ appears in the binomial expansion. Consequently,

$$C = \frac{n!}{\alpha!\,\beta!}.$$

An apparent exception to this formula is presented by the terms $a^n$ and $b^n$ of the expansion, each of which occurs only once. To eliminate this exception let us agree to let the symbol 0! represent unity; we are then able to write the coefficients of $a^n$ and $b^n$ as $n!/(n!\,0!)$ and $n!/(0!\,n!)$, respectively, in agreement with the formula.
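
The counting argument can be checked by brute force for small n. The sketch below is illustrative only and not from the original text; the function names are assumptions. It enumerates every way of choosing one letter from each of the n parentheses and compares the tallies with $n!/(\alpha!\,\beta!)$.

```python
from collections import Counter
from itertools import product
from math import factorial


def coefficients_by_counting(n):
    """Count, for each alpha, the selections that yield a**alpha * b**(n - alpha)."""
    counts = Counter()
    # One letter is chosen from each of the n parentheses (a + b).
    for selection in product("ab", repeat=n):
        counts[selection.count("a")] += 1
    return counts


def coefficient_by_formula(n, alpha):
    """C = n! / (alpha! * beta!) with beta = n - alpha; factorial(0) is 1."""
    beta = n - alpha
    return factorial(n) // (factorial(alpha) * factorial(beta))


if __name__ == "__main__":
    n = 5
    counted = coefficients_by_counting(n)
    for alpha in range(n, -1, -1):
        print(alpha, counted[alpha], coefficient_by_formula(n, alpha))
```

The enumeration visits $2^n$ selections, so it is meant only as a check for small exponents.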

The individual possibilities of forming the product P can also be represented geometrically. We can, for example, represent the first possibility considered above in the following way: We mark off a horizontal distance of $\alpha_1$ successive segments a, and from the end of this distance extend a vertical distance of $\beta_1$ successive segments b, from the end of this vertical line a third horizontal distance of $\alpha_2$ successive segments a, etc. In a similar manner we represent the other possibilities of forming the product P; however, we begin all C zigzag traces, which represent the C possibilities, from the same point. Thus, for example, if we are concerned with finding the number $\nu$ of all the products of the form $a^{11} b^7$ in the binomial expansion of $(a + b)^{18}$, we draw a rectangular network of 11 · 7 rectangular compartments possessing a horizontal side a and a vertical side b and lying in seven 11-compartment rows one below the other. The possibility $a^4 b^3 a^7 b^4$ (a from the first four parentheses, b from the following three, a from the next seven parentheses and b from the last four) is then represented by the unbroken heavy line, and the possibility $b^2 a^6 b^3 a^2 b^2 a^3$ by the line of dashes. The sought-for number $\nu$ is therefore equal to the number of all the possible direct paths leading from the corner E of the network to the opposite corner F.


FIG.  1. 


The formula previously found for C thus also provides us with the solution to the interesting problem:

A city has m streets that run from east to west and n that run from north to south; how many ways (without detours) are there of getting from the northwest corner of the city to the southeast corner?

Since there are (n - 1) west-east partial paths a and (m - 1) north-south partial paths b, the number of all the possible paths is

$$\frac{(m + n - 2)!}{(m - 1)!\,(n - 1)!}.$$
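
The path count can also be obtained without the factorial formula, by adding up routes crossing by crossing. The following Python sketch is an illustration under that assumption and is not taken from the original text; `count_paths` and `paths_by_formula` are hypothetical names.

```python
from math import factorial


def count_paths(m, n):
    """Count detour-free routes across m east-west and n north-south streets."""
    # ways[i][j] = number of routes from the northwest corner to crossing (i, j);
    # along the northern and western edges there is exactly one route.
    ways = [[1] * n for _ in range(m)]
    for i in range(1, m):
        for j in range(1, n):
            # The last block walked ran either southward or eastward.
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1]
    return ways[m - 1][n - 1]


def paths_by_formula(m, n):
    """(m + n - 2)! / ((m - 1)! (n - 1)!) as stated in the text."""
    return factorial(m + n - 2) // (factorial(m - 1) * factorial(n - 1))


if __name__ == "__main__":
    # The 11 x 7 network of Fig. 1 has 8 horizontal and 12 vertical lines.
    print(count_paths(8, 12), paths_by_formula(8, 12))
```

For the network of Fig. 1 both functions give 31824, which is the number of products $a^{11} b^7$ in the expansion of $(a + b)^{18}$.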

Now  back  to  the  binomial  theorem! 

Determination of the binomial coefficient C gives us immediately the sought-for binomial expansion:

$$(a + b)^n = \sum C a^\alpha b^\beta \quad\text{with}\quad C = \frac{n!}{\alpha!\,\beta!}.$$

Here $\alpha$ and $\beta$ pass through all the possible integral non-negative values that satisfy the condition $\alpha + \beta = n$.

The expansion of $(a + b)^5$, for example, gives

$$(a + b)^5 = a^5 + \frac{5!}{4!\,1!} a^4 b + \frac{5!}{3!\,2!} a^3 b^2 + \frac{5!}{2!\,3!} a^2 b^3 + \frac{5!}{1!\,4!} a b^4 + b^5$$

or

$$(a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5.$$

Instead of $n!/(\alpha!\,\beta!)$ one usually writes

$$\frac{n(n - 1)(n - 2) \cdots (n - \alpha + 1)}{1 \cdot 2 \cdot 3 \cdots \alpha}$$

and also abbreviates this coefficient $n_\alpha$ (read as n sub $\alpha$). The expansion then takes on a somewhat simpler appearance:

$$(a + b)^n = a^n + n_1 a^{n-1} b + n_2 a^{n-2} b^2 + \cdots + b^n.$$

The coefficient $n_\nu$ is known as the binomial coefficient to the base n with index $\nu$.

The  binomial  theorem  was  probably  discovered  by  the  Persian  astronomer 
Omar  Khayyam,  who  lived  during  the  eleventh  century.  At  least  he  prided 
himself  on  having  discovered  the  expansion  “for  all  (integral  positive)  exponents 
n,  which  no  one  had  been  able  to  accomplish  before  him.” 

Note. The derivation given above is easily extended to give the nth power expansion of a polynomial a + b + c + .... The polynomial theorem for a polynomial consisting of three terms, for example, is

$$(a + b + c)^n = \sum \frac{n!}{\alpha!\,\beta!\,\gamma!}\, a^\alpha b^\beta c^\gamma,$$

where the sum $\sum$ includes all possible terms for which the integral non-negative exponents $\alpha$, $\beta$, $\gamma$ satisfy the condition $\alpha + \beta + \gamma = n$.
Cauchy's Mean Theorem


The  geometric  mean  of  several  positive  numbers  is  smaller  than  the 
arithmetic  mean  of  these  numbers. 


Augustin  Louis  Cauchy  (1789-1857)  was  one  of  the  greatest  French 
mathematicians.  The  theorem  concerning  the  arithmetic  and  geometric  means 
occurs in his Cours d'Analyse (pp. 458-9), which appeared in 1821.

The  proof  of  the  theorem  that  will  be  presented  here  is  based  upon  the 
solution  of  the  fundamental  problem:  When  does  the  product  of  n  positive 
numbers  of  constant  sum  attain  its  maximum  value? 

We will call the n numbers a, b, c, ..., their constant sum K, and their product P. Experimentation with various numbers suggests that the product P reaches its maximal value when the numbers a, b, c, ... all possess the same value M = K/n.

To determine the accuracy of this hypothesis, we use the

Auxiliary theorem: Of two pairs of numbers of equal sum the pair possessing the greater product is the one whose numbers exhibit the smaller difference.

[If X and Y represent one pair and x and y the other, and X + Y = x + y, the auxiliary theorem follows from the equations

$$4XY = (X + Y)^2 - (X - Y)^2, \qquad 4xy = (x + y)^2 - (x - y)^2,$$

in which the minuends of the right sides are equal and the greater right side is the one in which the subtrahend is smaller.]

If the n numbers a, b, c, ... are not all equal, then at least one, a, for example, must be greater than M, and at least one, let us say b, must be smaller than M. Let us form a new system of n numbers a', b', c', ... in such a manner that (1) a' = M, (2) the pairs a, b and a', b' have the same sum, (3) the other numbers c', d', e', ... are equal to c, d, e, .... The new numbers then have the same sum K as the old ones, but a greater product P' (= a'b'c'...), since in accordance with the auxiliary theorem a'b' > ab.

If the numbers a', b', c', ... are not all equal to M, then at least one, let us say b', is greater (smaller) and at least one, say c', is smaller (greater) than M. Let us form a new system of n numbers a'', b'', c'', d'', ... in such a manner that (1) a'' = a' = M, (2) b'' = M, (3) the pairs b', c' and b'', c'' possess the same sum, (4) d'', e'', ... are equal to d', e', .... The numbers a'', b'', c'', ... then have the same sum K as the numbers a', b', c', ..., but possess a greater product P'' = a''b''c''..., since in accordance with the auxiliary theorem b''c'' > b'c'.

We continue in this fashion and obtain a series of increasing products P, P', P'', ..., each successive member of which is greater than the immediately preceding one and contains at least one more factor M. The last product obtained in this manner is the greatest of all and consists of n equal factors M. Consequently,

$$P < M^n,$$

which gives us the theorem:

The product of n positive numbers whose sum is constant attains its maximal value when the numbers are equally great.

If we extract the nth root of the last inequality and express P and M in terms of the magnitudes a, b, c, ..., we obtain Cauchy's formula:

$$\sqrt[n]{abc \cdots} < \frac{a + b + c + \cdots}{n}.$$

This  is  expressed  verbally  as  follows: 

The  theorem  of  the  arithmetic  and  geometric  mean:  The  geometric  mean 
of  several  numbers  is  always  smaller  than  the  arithmetic  mean  of  the  numbers , 
except  when  the  numbers  are  equal,  in  which  case  the  two  means  are  also  equal. 
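
The replacement procedure used in the proof can be traced numerically. The sketch below is an illustration, not part of the original text; the function name `smoothing_products` is an assumption, and exact fractions are used so that the comparison with M is not disturbed by rounding.

```python
from fractions import Fraction
from math import prod


def smoothing_products(numbers):
    """Apply the proof's replacement step until every number equals the mean M.

    Returns the list of successive products P, P', P'', ... and the mean M.
    """
    xs = [Fraction(x) for x in numbers]
    M = sum(xs) / len(xs)
    products = [prod(xs)]
    while any(x != M for x in xs):
        i = next(k for k, x in enumerate(xs) if x > M)  # a number above M
        j = next(k for k, x in enumerate(xs) if x < M)  # a number below M
        # Replace the pair by M and (old sum - M): same sum, smaller difference.
        xs[j] = xs[i] + xs[j] - M
        xs[i] = M
        products.append(prod(xs))
    return products, M


if __name__ == "__main__":
    products, M = smoothing_products([1, 2, 3, 10])
    print(products, M, M ** 4)
```

For the numbers 1, 2, 3, 10 the successive products are 60, 168, 240, 256, and the last one equals $M^4$ with M = 4.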

Note  1.  Cauchy’s  theorem  leads  directly  to  the  converse  of  the  above 
extreme  theorem: 

The  sum  of  n  positive  numbers  whose  product  is  constant  attains  its  minimal 
value  when  the  numbers  are  equal. 

Proof. Let us call the n numbers x, y, z, ..., their given product k, their variable sum s, and let us designate by m the nth root of k.

According to Cauchy,

$$\frac{x + y + z + \cdots}{n} \ge \sqrt[n]{xyz \cdots}, \quad\text{i.e.,}\quad \frac{s}{n} \ge m;$$

consequently

$$s \ge nm,$$

where the equality sign applies only in the event that x = y = z = .... Q.E.D.

The  two  preceding  extreme  theorems  form  the  basis  for  a  simple  solution  of 
many  problems  concerning  maximum  and  minimum  (cf.  Nos.  54,  92,  96,  98). 

Note 2. Cauchy's theorem also furnishes us directly with the important exponential inequality for the exponential function $x^\varepsilon$.

If a is any positive number not equal to 1, n a whole number > 0, m a whole number > n, then the geometric mean of the m numbers of which n possess the value a and the (m - n) others possess the value 1 is smaller than the arithmetic mean (na + m - n)/m of these m numbers, or

$$\sqrt[m]{a^n} < 1 + \frac{n}{m}(a - 1),$$

or, if we write $\varepsilon$ in place of n/m,

$$(1)\qquad a^\varepsilon < 1 + \varepsilon(a - 1).$$

In this inequality $\varepsilon$ is any rational, positive proper fraction. We will now show that this inequality is also true for any irrational proper fraction J.

First, it is clear that $a^J > 1 + J(a - 1)$ cannot be true for any irrational proper fraction J. If that were the case it would be possible to find a rational proper fraction R < J so close to J that $a^R$ would differ from $a^J$, and 1 + R(a - 1) from 1 + J(a - 1), by less than, let us say, one half of the difference $a^J - [1 + J(a - 1)]$. In that event $a^R$ would still be > 1 + R(a - 1), which is, however, impossible according to (1).

Now let z be so small that J + z and J - z are both positive proper fractions. Then we have

$$\frac{a^z + a^{-z}}{2} > 1$$

(since the arithmetic mean of the numbers $a^z$ and $a^{-z}$ is greater than their geometric mean 1 according to Cauchy) or

$$a^J < \frac{a^{J+z} + a^{J-z}}{2}.$$

According to the above relation, however,

$$a^{J+z} \le 1 + (J + z)(a - 1), \qquad a^{J-z} \le 1 + (J - z)(a - 1),$$

therefore

$$\frac{a^{J+z} + a^{J-z}}{2} \le 1 + J(a - 1);$$

thus, it is certain that

$$a^J < 1 + J(a - 1).$$

Inequality (1) is therefore true for any proper fraction $\varepsilon$.

If we replace $\varepsilon$ in (1) by $1/\mu$, $1 + \varepsilon(a - 1)$ by b, i.e., a by $1 + \mu(b - 1)$, (1) is transformed into

$$(2)\qquad b^\mu > 1 + \mu(b - 1),$$

where $\mu$ is any positive improper fraction, b any positive number (not equal to 1).

Conclusion. The exponential inequality. If x is any positive magnitude and $\varepsilon$ any positive exponent, the exponential inequality is:

$$x^\varepsilon \lessgtr 1 + \varepsilon(x - 1),$$

in which proper fractional exponents require the use of the upper sign and improper fractional exponents require the use of the lower sign.
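
A quick numerical spot check of the two cases may be helpful. The sketch below is illustrative only and not from the original text; `exponential_gap` is a hypothetical name, and the sample values (all with x different from 1) are arbitrary.

```python
import math


def exponential_gap(x, eps):
    """Return 1 + eps*(x - 1) - x**eps.

    According to the exponential inequality the gap is positive for a proper
    fraction eps (0 < eps < 1) and negative for eps > 1, provided x > 0, x != 1.
    """
    return 1 + eps * (x - 1) - x ** eps


if __name__ == "__main__":
    for x in (0.1, 0.5, 2.0, 10.0):
        for eps in (0.25, 1 / math.sqrt(2), 1.5, math.pi):
            print(x, round(eps, 4), exponential_gap(x, eps))
```

Rational and irrational exponents are treated alike here, in line with the extension argued in Note 2.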
Bernoulli's Power Sum Problem


Determine the sum

$$S = 1^p + 2^p + 3^p + \cdots + n^p$$

of the pth powers of the first n natural numbers for integral positive exponents p.

The  problem,  posed  in  this  general  form,  was  first  solved  in  the  Ars 
Conjectandi  (Probability  Computation),  which  appeared  in  1713.  It  was  the  work 
of  the  Swiss  mathematician  Jacob  Bernoulli  (1654-1705). 

The  following  elegant  solution  is  based  upon  the  binomial  theorem. 

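By resorting to the device of considering the magnitudes $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots$ resulting from the binomial expansion of $(x + \mathfrak{B})^\nu$ as unknowns subject to certain conditions rather than as powers of $\mathfrak{B}$, we obtain an amazingly short derivation of S.

According to the binomial theorem, if P is understood to represent the number p + 1,

$$(v + \mathfrak{B})^P = v^P + P_1 v^p \mathfrak{B}^1 + P_2 v^{p-1} \mathfrak{B}^2 + \cdots$$

and

$$(v + \mathfrak{B} - 1)^P = v^P + P_1 v^p (\mathfrak{B} - 1)^1 + P_2 v^{p-1} (\mathfrak{B} - 1)^2 + \cdots.$$

Subtraction of these two equations gives us

$$\text{(I)}\qquad (v + \mathfrak{B})^P - (v - 1 + \mathfrak{B})^P = P v^p + P_2 v^{p-1}\,[\mathfrak{B}^2 - (\mathfrak{B} - 1)^2] + P_3 v^{p-2}\,[\mathfrak{B}^3 - (\mathfrak{B} - 1)^3] + \cdots.$$

We now define the unknowns $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots$ by the equations

$$(1)\ \ (\mathfrak{B} - 1)^2 = \mathfrak{B}^2, \qquad (2)\ \ (\mathfrak{B} - 1)^3 = \mathfrak{B}^3, \qquad (3)\ \ (\mathfrak{B} - 1)^4 = \mathfrak{B}^4,$$

etc. This results in the simplification of (I) to

$$\text{(Ia)}\qquad P v^p = (v + \mathfrak{B})^P - (v - 1 + \mathfrak{B})^P.$$

This equation is formed for v = 1, 2, 3, ..., n and we thereby obtain

$$\begin{aligned}
P \cdot 1^p &= (1 + \mathfrak{B})^P - \mathfrak{B}^P,\\
P \cdot 2^p &= (2 + \mathfrak{B})^P - (1 + \mathfrak{B})^P,\\
&\ \ \vdots\\
P \cdot n^p &= (n + \mathfrak{B})^P - (n - 1 + \mathfrak{B})^P.
\end{aligned}$$

Addition of these n equations gives us

$$\text{(II)}\qquad PS = (n + \mathfrak{B})^P - \mathfrak{B}^P$$

or

$$\text{(II)}\qquad 1^p + 2^p + \cdots + n^p = \frac{(n + \mathfrak{B})^P - \mathfrak{B}^P}{P} \quad\text{with}\quad P = p + 1.$$

This formula, in which the magnitudes $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots$ on the right side of the equation, obtained from expansion of the binomial $(n + \mathfrak{B})^P$, are defined by equations (1), (2), (3), ..., gives us the sought-for power sum.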

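In order to apply it to the cases p = 1, 2, 3, 4, we first determine the unknowns $\mathfrak{B}^1$, $\mathfrak{B}^2$, $\mathfrak{B}^3$, and $\mathfrak{B}^4$ in accordance with equations (1), (2), ....

From (1) it follows that $-2\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^1 = \tfrac{1}{2}$. Then, from (2), $-3\mathfrak{B}^2 + 3\mathfrak{B}^1 - 1 = 0$, i.e., $\mathfrak{B}^2 = \tfrac{1}{6}$. And from (3), $-4\mathfrak{B}^3 + 6\mathfrak{B}^2 - 4\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^3 = 0$. Finally, from $(\mathfrak{B} - 1)^5 = \mathfrak{B}^5$ we obtain $\mathfrak{B}^4 = -\tfrac{1}{30}$. The numbers $\mathfrak{B}^1 = \tfrac{1}{2}$, $\mathfrak{B}^2 = \tfrac{1}{6}$, $\mathfrak{B}^3 = 0$, $\mathfrak{B}^4 = -\tfrac{1}{30}$, etc., are known as Bernoulli numbers.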

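Then from (II) we obtain

$$1 + 2 + 3 + \cdots + n = \frac{(n + \mathfrak{B})^2 - \mathfrak{B}^2}{2} = \frac{n^2 + 2n\mathfrak{B}^1}{2} = \frac{n(n + 1)}{2},$$

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{(n + \mathfrak{B})^3 - \mathfrak{B}^3}{3} = \frac{n^3 + 3n^2\mathfrak{B}^1 + 3n\mathfrak{B}^2}{3} = \frac{n(n + 1)(2n + 1)}{6},$$

$$1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{(n + \mathfrak{B})^4 - \mathfrak{B}^4}{4} = \frac{n^4 + 4n^3\mathfrak{B}^1 + 6n^2\mathfrak{B}^2 + 4n\mathfrak{B}^3}{4} = \left(\frac{n(n + 1)}{2}\right)^2,$$

$$1^4 + 2^4 + 3^4 + \cdots + n^4 = \frac{(n + \mathfrak{B})^5 - \mathfrak{B}^5}{5} = \frac{n^5 + 5n^4\mathfrak{B}^1 + 10n^3\mathfrak{B}^2 + 10n^2\mathfrak{B}^3 + 5n\mathfrak{B}^4}{5} = \frac{pst}{30},$$

with p = n(n + 1), s = 2n + 1, t = 3p - 1 (in this last abbreviation p denotes n(n + 1), not the exponent).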
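
The symbolic rule "(𝔅 - 1)^k = 𝔅^k" and formula (II) lend themselves to a direct computation. The following Python sketch is an illustration under the conventions of this section, not part of the original text; the names `bernoulli_symbols` and `power_sum` are assumptions, and `B[j]` stands for the unknown $\mathfrak{B}^j$ with $\mathfrak{B}^0 = 1$.

```python
from fractions import Fraction
from math import comb


def bernoulli_symbols(count):
    """Return [B^0, B^1, ..., B^count] defined by (B - 1)^k = B^k for k >= 2."""
    B = [Fraction(1)]
    for k in range(2, count + 2):
        # Expanding (B - 1)^k and cancelling B^k leaves
        #   -k * B^(k-1) + sum_{j <= k-2} C(k, j) (-1)^(k-j) B^j = 0.
        known = sum(comb(k, j) * (-1) ** (k - j) * B[j] for j in range(k - 1))
        B.append(known / k)
    return B


def power_sum(p, n):
    """1^p + 2^p + ... + n^p from (II): ((n + B)^P - B^P) / P with P = p + 1."""
    P = p + 1
    B = bernoulli_symbols(p)
    # (n + B)^P - B^P = sum_{j=0}^{p} C(P, j) * B^j * n^(P - j)
    total = sum(comb(P, j) * B[j] * n ** (P - j) for j in range(P))
    return total / P


if __name__ == "__main__":
    print(bernoulli_symbols(4))
    print(power_sum(4, 10), sum(v ** 4 for v in range(1, 11)))
```

Exact fractions are used so that the result can be compared with the directly computed sum without rounding.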

If n in (II) increases without limit, S also increases without limit, but the quotient $S/n^P$ possesses a finite value. In fact, in accordance with the binomial theorem, (II) is written

$$PS = n^P + P_1 \mathfrak{B}^1 n^p + P_2 \mathfrak{B}^2 n^{p-1} + \cdots,$$

so that

$$\frac{S}{n^P} = \frac{1}{P} + \frac{P_1 \mathfrak{B}^1}{Pn} + \frac{P_2 \mathfrak{B}^2}{Pn^2} + \cdots.$$

Now, if n increases infinitely all the fractions on the right-hand side with the exception of the first become infinitely small, and we obtain the limit equation of the power sum:

$$\text{(III)}\qquad \lim_{n \to \infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1}.$$

This important limit equation can also be derived from the exponential inequality (No. 10)

$$x^P > 1 + P(x - 1).$$

This derivation has the advantage over the one just given that it is true for any positive exponent p, not only for integral positive exponents!

If we first replace x in the exponential inequality with the improper fraction V/v, then by the proper fraction v/V, after elimination of the denominators we obtain

$$V^P > v^P + P v^p (V - v) \quad\text{and}\quad v^P > V^P - P V^p (V - v)$$

or

$$P v^p < \frac{V^P - v^P}{V - v} < P V^p.$$

Into this new inequality we introduce the series 1|0, 2|1, 3|2, ..., n|n - 1 for the pair of values V|v and we obtain

$$\begin{aligned}
P \cdot 0^p &< 1^P - 0^P < P \cdot 1^p,\\
P \cdot 1^p &< 2^P - 1^P < P \cdot 2^p,\\
&\ \ \vdots\\
P (n - 1)^p &< n^P - (n - 1)^P < P n^p.
\end{aligned}$$

Addition of these n inequalities results in

$$P(S - n^p) < n^P < PS$$

or

$$\frac{1}{P} < \frac{S}{n^P} < \frac{1}{P} + \frac{1}{n}.$$

Since both boundaries between which the quotient $S/n^P$ is situated assume the value 1/P when n = ∞,

$$\text{(III)}\qquad \lim_{n \to \infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1},$$

where p represents any positive magnitude.
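
The two bounds just derived can be observed numerically, including for non-integral exponents. The sketch below is illustrative and not from the original text; `power_sum_ratio` and `bounds` are hypothetical names, and ordinary floating-point arithmetic is assumed to be accurate enough for the sample values.

```python
def power_sum_ratio(p, n):
    """Return S / n**(p + 1) with S = 1**p + 2**p + ... + n**p."""
    S = sum(v ** p for v in range(1, n + 1))
    return S / n ** (p + 1)


def bounds(p, n):
    """Lower and upper bound 1/P and 1/P + 1/n, where P = p + 1."""
    P = p + 1
    return 1 / P, 1 / P + 1 / n


if __name__ == "__main__":
    for p in (0.5, 1, 2.7):
        for n in (10, 1000, 100000):
            low, high = bounds(p, n)
            print(p, n, low, power_sum_ratio(p, n), high)
```

As n grows, the printed ratio is squeezed toward 1/(p + 1) from above, in agreement with (III).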

If the mean value of the function $x^p$ is introduced, the limit equation of the power sum can be obtained in still another form.

The mean value of a function over an interval is commonly understood to mean the limiting value toward which the mean value of n values of the function uniformly distributed over the interval tends if n increases without limit. The mean value M of the function f(x) over the interval 0 to x, if $\delta$ represents the nth part of x, is thus the limiting value of the quotient

$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$

for n = ∞. We write this mean value as $\mathfrak{M}_0^x f(x)$.

Thus, the mean value of the function $x^p$ over the interval 0 through x is the limiting value of

$$\mu = \frac{\delta^p + (2\delta)^p + \cdots + (n\delta)^p}{n} = \delta^p\, \frac{1^p + 2^p + \cdots + n^p}{n},$$

i.e., since $\delta = x/n$, the limiting value of

$$\mu = x^p\, \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}}.$$

Since the fractional factor of the right side according to (III) has the limiting value 1/(p + 1), it follows that the sought-for mean value of the function $x^p$ is

$$\text{(IIIa)}\qquad \mathfrak{M}_0^x x^p = \frac{x^p}{p + 1};$$

this formula, however, is basically no different from (III).

Formula (III) or (IIIa) has found many applications in geometry and physics.
The Euler Number


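Find the limiting values of the functions

$$\varphi(x) = \left(1 + \frac{1}{x}\right)^x \quad\text{and}\quad \Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$$

for an infinitely increasing x.

The simplest solution of this very interesting problem is based upon the exponential inequality

$$x^\varepsilon < 1 + \varepsilon(x - 1)$$

(cf. No. 10), in which x is any positive magnitude and $\varepsilon$ is any proper fraction between 0 and 1.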

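Let us introduce two arbitrary positive numbers a and b, the first of which is larger than the second, and introduce into the exponential inequality first

$$x = 1 + \frac{1}{b}, \qquad \varepsilon = \frac{b}{a},$$

and then

$$x = 1 - \frac{1}{b + 1}, \qquad \varepsilon = \frac{b + 1}{a + 1}.$$

The first substitution gives

$$\left(1 + \frac{1}{b}\right)^{b/a} < 1 + \frac{1}{a} \quad\text{or}\quad (1)\qquad \left(1 + \frac{1}{b}\right)^b < \left(1 + \frac{1}{a}\right)^a,$$

the second

$$\left(1 - \frac{1}{b + 1}\right)^{(b+1)/(a+1)} < 1 - \frac{1}{a + 1} \quad\text{or}\quad \left(\frac{b}{b + 1}\right)^{(b+1)/(a+1)} < \frac{a}{a + 1}$$

or, finally,

$$(2)\qquad \left(1 + \frac{1}{a}\right)^{a+1} < \left(1 + \frac{1}{b}\right)^{b+1}.$$

The two inequalities obtained, (1) and (2), contain the remarkable theorem:

With an increasingly positive argument x the function $\varphi(x) = \left(1 + \frac{1}{x}\right)^x$ increases and the function $\Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$ decreases.

Thus, for X > x

$$\varphi(X) > \varphi(x), \quad\text{whereas}\quad \Phi(X) < \Phi(x).$$

Since, on the other hand, for the same values of the argument the function $\Phi$ exceeds the function $\varphi$

$$\left[\Phi(x) = \left(1 + \frac{1}{x}\right)\varphi(x)\right],$$

we obtain the inequalities

$$\varphi(x) < \varphi(X) < \Phi(X) \quad\text{and}\quad \varphi(X) < \Phi(X) < \Phi(x),$$

i.e., every value of the function $\Phi$ is greater than every value of the function $\varphi$. (Only positive values of the argument will be considered.)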

Let us imagine two movable points p and P on the positive number axis which are situated at distances $\varphi(t)$ and $\Phi(t)$ from the zero point at time t and begin their movements in the instant t = 1. Point p, beginning from $\varphi(1) = 2$, then moves continuously toward the right, while point P, which begins at $\Phi(1) = 4$, moves continuously toward the left. However, since $\Phi(t)$ is always greater than $\varphi(t)$, i.e., P is always to the right of p, the points can never meet. Nevertheless, the distance between them,

$$d = \Phi(t) - \varphi(t) = \frac{\varphi(t)}{t},$$

is diminished without limit with increasing time (since $\varphi(t) < 4$, and thus d < 4/t), so that they finally are separated by an infinitely small distance.

The  only  way  to  explain  this  situation  is  to  assume  that  on  the  number  axis 
(between  the  numbers  2  and  4)  there  exists  a  fixed  point  that  the  moving  points  p 
and  P  approach  infinitely  closely  from  the  left  and  from  the  right,  respectively, 
without  ever  touching.  The  distance  of  this  fixed  point  from  the  zero  point  is  the 
so-called  Euler  number  e.  The  proposal  to  designate  this  number,  which  also 
forms  the  base  of  the  natural  logarithmic  system  (No.  14),  by  the  letter  e  stems 
from  Euler  ( Commentarii  Academiae  Petropolitanae  ad  annum  1739,  vol.  IX). 


The important inequality

$$\text{(I)}\qquad \left(1 + \frac{1}{x}\right)^x < e < \left(1 + \frac{1}{x}\right)^{x+1}$$

is true for Euler's number (x represents any positive number > 0).

If we choose x = 1,000,000, this inequality gives us the number e exactly to five decimal places. However, the use of the series for e (No. 13) is a better method of computation.

Then we obtain

$$e = 2.718281828459045\ldots.$$
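
The claim about x = 1,000,000 can be tried out directly. The sketch below is an illustration, not part of the original text; the names `phi` and `Phi` mirror the two functions of this section, and double-precision arithmetic is assumed to be accurate enough (its rounding error here is far below the sixth decimal place).

```python
import math


def phi(x):
    """Lower bound (1 + 1/x)**x, which increases with x."""
    return (1 + 1 / x) ** x


def Phi(x):
    """Upper bound (1 + 1/x)**(x + 1), which decreases with x."""
    return (1 + 1 / x) ** (x + 1)


if __name__ == "__main__":
    for x in (1, 10, 1000, 10 ** 6):
        print(x, phi(x), Phi(x))
    print(math.e)
```

Both bounds agree with e in the first five decimal places for x = 1,000,000, while their difference $\varphi(x)/x$ is still about 0.0000027.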

The sought-for limiting values are therefore

$$\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^{x+1} = e \quad\text{and}\quad \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e,$$

the first of which is an upper limit, while the second is a lower limit.

Note. From the inequality (I) for the number e the inequality for the exponential function $e^x$ follows directly.

1. In the inequality

$$\left(1 + \frac{1}{x}\right)^x < e$$

we replace x by 1/P, where P is any positive number > 0; we raise both sides to the power P and obtain

$$(1)\qquad e^P > 1 + P.$$

2. In the inequality

$$e < \left(1 + \frac{1}{x}\right)^{x+1}$$

we replace x + 1 by -1/n, thus $1 + \frac{1}{x}$ by $\frac{1}{1 + n}$, n being a negative proper fraction ≠ 0; we raise both sides to the power n and obtain

$$(2)\qquad e^n > 1 + n.$$

3. We consider that for every negative improper fraction N, (1 + N) is negative or zero, and consequently we have

$$(3)\qquad e^N > 1 + N.$$

Combining the inequalities (1), (2), (3), we obtain the inequality of the exponential function:

$$e^x \ge 1 + x,$$

which is true for every finite real value of x and only becomes an equation when x = 0.

The inequality obtained leads directly to the so-called limit equation of the exponential function.

Let x be any finite real magnitude and n a positive number of such magnitude that $1 \pm \frac{x}{n}$ is positive. In accordance with the inequality of the exponential function,

$$e^{x/n} > 1 + \frac{x}{n} \quad\text{and}\quad e^{-x/n} > 1 - \frac{x}{n}.$$

We assign these inequalities the power n, in the case of the second, however, only after we have multiplied it by $1 + \frac{x}{n}$. This results in

$$e^x > \left(1 + \frac{x}{n}\right)^n \quad\text{and}\quad \left(1 + \frac{x}{n}\right)^n e^{-x} > \left(1 - \frac{x^2}{n^2}\right)^n.$$

Since the right-hand side of the second inequality, in accordance with the exponential inequality (No. 10), is greater than $1 - \frac{x^2}{n}$, then actually

$$\left(1 + \frac{x}{n}\right)^n > \left(1 - \frac{x^2}{n}\right) e^x.$$

Combining the inequalities obtained, we get

$$\left(1 - \frac{x^2}{n}\right) e^x < \left(1 + \frac{x}{n}\right)^n < e^x.$$

If n is then allowed to increase infinitely, the left side of this inequality is transformed into $e^x$ and we obtain the limit equation of the exponential function:

$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x,$$

in which x represents any finite real number and n is an infinitely increasing magnitude.
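
The squeeze between the two bounds can be watched numerically. The following Python sketch is illustrative only and not from the original text; `squeeze` is a hypothetical name, and the library function `math.exp` is used for $e^x$ as a convenience rather than as part of the argument.

```python
import math


def squeeze(x, n):
    """Return (lower, middle, upper) for (1 - x^2/n) e^x < (1 + x/n)^n < e^x."""
    lower = (1 - x * x / n) * math.exp(x)
    middle = (1 + x / n) ** n
    upper = math.exp(x)
    return lower, middle, upper


if __name__ == "__main__":
    for x in (2.0, -3.0):
        for n in (100, 10000, 1000000):
            print(x, n, squeeze(x, n))
```

The sample values of n are chosen large enough that $1 \pm x/n$ is positive, as the derivation requires.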
Newton's Exponential Series


Transform the exponential function $e^x$ into a progression in terms of powers of x.


This  power  progression,  the  so-called  exponential  series,  which  may  in  fact 
be  the  most  important  series  in  mathematics,  was  discovered  by  the  great 
English  mathematician  and  physicist  Isaac  Newton  (1642-1727).  The  famous 
treatise  that  contains  the  sine  series,  the  cosine  series,  the  arc  sine  series,  the 
logarithmic  series,  and  the  binomial  series  as  well  as  the  exponential  series  was 
written in 1665 and bears the title De analysi per aequationes numero terminorum infinitas. Newton's derivation of the exponential series is, however, not rigorous and rather complicated.

The following derivation is based upon the mean values of the functions $x^p$ (No. 11) and $e^x$.

We find the mean value of the function $e^x$ with the help of the inequality of the exponential function

$$(1)\qquad e^u > 1 + u \qquad \text{(No. 12)}.$$

We will consider two arbitrary values v and $V = v + \varphi > v$ of the argument of the exponential function and first set $u = \varphi$ and then $u = -\varphi$ in (1). This gives us

$$e^\varphi > 1 + \varphi \quad\text{and}\quad e^{-\varphi} > 1 - \varphi, \text{ respectively.}$$

Multiplication with $e^v$ and $e^V$, respectively, results in

$$e^V > e^v + \varphi e^v \quad\text{and}\quad e^v > e^V - \varphi e^V, \text{ respectively;}$$

combining, we obtain:

$$(2)\qquad e^v < \frac{e^V - e^v}{V - v} < e^V.$$

The mean value M of $e^x$ over the interval 0 to x is the limiting value of the
quotient

$$\mu = \frac{e^{\delta} + e^{2\delta} + e^{3\delta} + \cdots + e^{n\delta}}{n} \qquad \left(\delta = \frac{x}{n}\right)$$

for an unlimitedly increasing n. In order to find $\mu$, for a positive x we set down in
(2) for the pair of values v|V in succession

$0|\delta,\ \delta|2\delta,\ 2\delta|3\delta,\ \ldots,\ (n - 1)\delta|n\delta$

and add the resulting n inequalities. This gives

$$n\mu + 1 - e^x < \frac{e^x - 1}{\delta} < n\mu$$

or, solved for $\mu$,

$$\frac{e^x - 1}{x} < \mu < \frac{e^x - 1}{x} + \frac{e^x - 1}{n} \qquad (x > 0).$$

For a negative x we put down successively for v|V in (2)

$\delta|0,\ 2\delta|\delta,\ 3\delta|2\delta,\ \ldots,\ n\delta|(n - 1)\delta.$

Summation of the resulting n inequalities then leads to the same final inequality;
only in this case the extremes are reversed, so that this time it reads

$$\frac{e^x - 1}{x} + \frac{e^x - 1}{n} < \mu < \frac{e^x - 1}{x} \qquad (x < 0).$$

If we then allow n to become infinite in the two inequalities obtained, we get
for the lim $\mu$ the value

(3)  $\overset{x}{\underset{0}{\mathfrak{M}}}\, e^x = \dfrac{e^x - 1}{x}$,

whether x is positive or negative.
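
A numerical sketch of this mean value, under the assumption that the arithmetic mean is formed exactly as in the quotient above (the names `mean_exp` and `limit_value` are hypothetical helpers introduced only for this illustration):

```python
import math

def mean_exp(x: float, n: int) -> float:
    """Arithmetic mean of e^(k*delta) for k = 1..n, with delta = x/n."""
    delta = x / n
    return sum(math.exp(k * delta) for k in range(1, n + 1)) / n

def limit_value(x: float) -> float:
    """Right-hand side of (3): (e^x - 1)/x."""
    return (math.exp(x) - 1) / x

# Sample arguments (assumed); the mean approaches the limit as n grows.
for x in (2.0, -1.5):
    print(x, mean_exp(x, 1000), limit_value(x))
```

For positive x the mean lies slightly above $(e^x - 1)/x$, for negative x slightly below, in agreement with the two inequalities.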

Now for the series expansion of $e^x$!
We begin with the inequality

$$e^x > 1 + x.$$

We assume initially that x is positive and obtain the mean values of both sides.
This gives us

$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!}.$$

Repeated mean formation gives rise to

$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$

We continue in this manner and obtain

(4)  $e^x > 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots + \dfrac{x^n}{n!}$.


In order to obtain an upper limit for $e^x$ also we begin with the inequality

$$e^{-x} > 1 - x,$$

multiply by $e^x$ and obtain $1 > e^x - xe^x$ or

$$e^x < 1 + xe^x.$$

In the subsequent mean formations we employ the self-evident theorem: “The
mean of the product of two (positive) functions u and v is smaller than the
product of the mean value of u and the maximum value of v over the interval
considered.”

In the first step ($u = x$, $v = e^x$) we obtain

$$\frac{e^x - 1}{x} < 1 + \frac{x}{2}e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!}e^x,$$

in the second step ($u = x^2/2!$, $v = e^x$)

$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!}e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}e^x,$$

etc., and finally

(5)  $e^x < 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots + \dfrac{x^{n-1}}{(n-1)!} + \dfrac{x^n}{n!}e^x$.


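If we then consider the case in which x is negative, the situation is somewhat
simpler.

From $e^x > 1 + x$ it follows as above that

$$\frac{e^x - 1}{x} > 1 + \frac{x}{2};$$

however, since x is now negative,

$$e^x < 1 + x + \frac{x^2}{2!}.$$

The next mean formation yields

$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!},$$

the next

$$e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!},$$

etc., and finally

(6)  $e^x > 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots + \dfrac{x^{2\nu - 1}}{(2\nu - 1)!}$

and

(7)  $e^x < 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots + \dfrac{x^{2\nu}}{(2\nu)!}$.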


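From inequalities (4), (5), (6), and (7) it follows that:

When x is positive $e^x$ lies between

$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!} + \frac{x^n}{n!}e^x,$$

and when x is negative between

$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n+1)!}.$$

Then if we write

(8)  $e^x = 1 + x + \dfrac{x^2}{2!} + \cdots + \dfrac{x^n}{n!}$,

the error encountered for a positive value of x is less than

$$\frac{x^n}{n!}\left(e^x - 1\right),$$

and for a negative value of x less than

$$\frac{|x|^{n+1}}{(n+1)!}.$$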


But for a finite value of x and for an infinitely increasing n the fraction $x^n/n!$
approaches zero. [In accordance with No. 10 each of the products 2(n - 1), 3(n - 2),
..., (n - 1) · 2 is greater than 1 · n. The product of these products is therefore
greater than $n^{n-2}$, i.e., $(n-1)!^2 > n^{n-2}$ or $n!^2 > n^n$ or $n! > \left(\sqrt{n}\right)^n$. Thus, it follows
that

$$\left|\frac{x^n}{n!}\right| < \left(\frac{|x|}{\sqrt{n}}\right)^n.$$

If n is assigned a value such that $\sqrt{n}$ is greater than |2x|, then

$$\left|\frac{x^n}{n!}\right| < \left(\frac{1}{2}\right)^n$$

and

$$\lim_{n \to \infty} \frac{x^n}{n!} = 0.]$$

The error encountered with formula (8) thus disappears as n increases
infinitely. Consequently:

The progression

(9)  $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$

is true for every finite x.

Note. The series obtained is particularly well suited for computation of the
Euler number e. If, for example, we set x equal to 1,

$$e \approx 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{10!} = 2.7182818011\ldots$$

and the encountered error is

$$h = \frac{1}{11!} + \frac{1}{12!} + \frac{1}{13!} + \cdots = \frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12 \cdot 13} + \cdots\right),$$

which is smaller than

$$\frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12^2} + \frac{1}{12^3} + \cdots\right)$$

or smaller than

$$\frac{12}{11 \cdot 11!} < 0.00000008.$$

The exact value is e = 2.71828182845904523536 ....
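
The arithmetic of this note can be reproduced with exact rational numbers. This is a minimal sketch; `partial_sum` and `tail_bound` are names chosen here for illustration, and the bound is the geometric-series estimate used above.

```python
from fractions import Fraction
from math import e, factorial

def partial_sum(n: int) -> Fraction:
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def tail_bound(n: int) -> Fraction:
    """Geometric-series bound for 1/(n+1)! + 1/(n+2)! + ...,
    i.e. (1/(n+1)!) * (1 + 1/(n+2) + 1/(n+2)^2 + ...)."""
    return Fraction(n + 2, (n + 1) * factorial(n + 1))

s10 = partial_sum(10)
print(float(s10), float(tail_bound(10)))
```

The difference between e and the printed partial sum stays below the printed bound, which for n = 10 equals 12/(11 · 11!).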

Formula  (9),  which  applies  to  every  finite  real  value  of  x,  suggests  the  further 
extension  of  the  concept  of  the  exponential  function  to  include  the  complex 
argument  values  z. 

The exponential function $e^z$ for the complex argument z is defined by the
formula

(10)  $e^z = 1 + z + \dfrac{z^2}{2!} + \dfrac{z^3}{3!} + \cdots$ to infinity.

It is easily seen that the infinite power series on the right-hand side of (10)
has a definite finite value for every finite z, or, in other words, that the series
converges for every finite z:

We set

$$1 + z + \frac{z^2}{2!} + \cdots + \frac{z^n}{n!} = E_n(z),$$

$$\frac{z^{n+1}}{(n+1)!} + \frac{z^{n+2}}{(n+2)!} + \cdots + \frac{z^{n+\nu}}{(n+\nu)!} = R_\nu(z),$$

so that

$$E_{n+\nu}(z) - E_n(z) = R_\nu(z).$$

If $\zeta$ represents the absolute magnitude of z, then the absolute magnitude of $R_\nu(z)$
cannot exceed

$$\frac{\zeta^{n+1}}{(n+1)!} + \frac{\zeta^{n+2}}{(n+2)!} + \cdots + \frac{\zeta^{n+\nu}}{(n+\nu)!}$$

and is consequently smaller than

$$\frac{\zeta^{n+1}}{(n+1)!} + \frac{\zeta^{n+2}}{(n+2)!} + \cdots \text{ to } \infty = e^{\zeta} - E_n(\zeta).$$

Since, in accordance with (8) or (9), $e^{\zeta} - E_n(\zeta)$ can be made as small as desired
with the selection of a sufficiently high value for n, $R_\nu(z)$ can certainly be made
as small as desired for such an n, no matter how great the value of $\nu$. However,
this means that the series

$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$

converges. (It is in fact absolutely convergent, i.e., it still converges when z is
converted into its absolute magnitude $\zeta$.)

Moreover, let a and b be two arbitrary real or complex values, $\alpha$ and $\beta$ their
absolute magnitudes, and $\alpha + \beta = \gamma$. By multiplication of

$$E_n(a) = 1 + \frac{a}{1!} + \frac{a^2}{2!} + \cdots + \frac{a^n}{n!}$$

and

$$E_n(b) = 1 + \frac{b}{1!} + \frac{b^2}{2!} + \cdots + \frac{b^n}{n!}$$

we obtain $E_n(a)E_n(b) = 1 + C_1 + C_2 + \cdots + C_{2n}$, $C_\nu$ representing the sum of all
the members of the form $\dfrac{a^r b^s}{r!\,s!}$ in which the exponents r and s have the sum $\nu$. As
long as $\nu$ does not exceed the value of n, all $\nu + 1$ pairs (r, s) of non-negative indices
with the sum $\nu$ occur in $C_\nu$, whereas when $\nu > n$ only some of them do. Consequently,
according to the binomial theorem (No. 9)

for $\nu \le n$:  $C_\nu = \dfrac{1}{\nu!}(a + b)^\nu$,

for $\nu > n$:  $|C_\nu| < \dfrac{1}{\nu!}\gamma^\nu$.

The sum of the first (n + 1) terms of $E_n(a)E_n(b)$ is therefore equal to $E_n(a + b)$,
and the sum of the absolute magnitudes of the following n terms is smaller than
$R_n(\gamma)$, i.e., is certainly smaller than

$$\frac{\gamma^{n+1}}{(n+1)!} + \frac{\gamma^{n+2}}{(n+2)!} + \cdots \text{ to infinity} = e^{\gamma} - E_n(\gamma) = \delta,$$

so that we can set the sum of these n terms equal to $\varepsilon\delta$, where $|\varepsilon| < 1$.
Accordingly, we obtain the equation

$$E_n(a) \cdot E_n(b) = E_n(a + b) + \varepsilon\delta.$$

If we then allow n to become infinite in this equation, $\delta$ becomes equal to zero,
and the equation is converted into

(11)  $e^a \cdot e^b = e^{a+b}$.
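
The estimate leading to (11) can be tried out on concrete complex numbers. The sketch below assumes nothing beyond the definitions of $E_n$ and $\delta$ given above; the helper names `E` and `delta` and the sample values of a and b are illustrative choices.

```python
import cmath
from math import exp, factorial

def E(n: int, z: complex) -> complex:
    """Partial sum E_n(z) = 1 + z + z^2/2! + ... + z^n/n!."""
    return sum(z ** k / factorial(k) for k in range(n + 1))

def delta(n: int, gamma: float) -> float:
    """delta = e^gamma - E_n(gamma), the bound on |E_n(a)E_n(b) - E_n(a+b)|."""
    return exp(gamma) - E(n, gamma).real

a, b = 0.7 + 1.2j, -0.4 + 0.9j        # sample values (assumed)
gamma = abs(a) + abs(b)
for n in (4, 8, 16):
    gap = abs(E(n, a) * E(n, b) - E(n, a + b))
    print(n, gap, delta(n, gamma))
```

The printed gap stays below $\delta$ for each n, and both shrink rapidly, which is the content of the passage to the limit.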

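This fundamental formula justifies our previous suggestion of designating the
series

$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$

as $e^z$.

Now let z = x + iy, where x and y are real. According to (11), $e^z = e^x \cdot e^{iy}$ or, since

$$e^{iy} = 1 + iy - \frac{y^2}{2!} - i\frac{y^3}{3!} + \frac{y^4}{4!} + i\frac{y^5}{5!} - \frac{y^6}{6!} - \cdots,$$

$$e^z = e^x\left[\left(1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \frac{y^6}{6!} + \cdots\right) + i\left(y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots\right)\right].$$

The brackets appearing here are, in accordance with No. 15, cos y and sin y,
and we obtain the Euler formula:

(12)  $e^{x + iy} = e^x(\cos y + i\sin y)$,

which when x = 0 takes the form

(12a)  $e^{iy} = \cos y + i\sin y$.

If in (12a) $y = \pi$, we obtain the remarkable Euler relation

$$e^{i\pi} = -1$$

between the two significant numbers e and $\pi$.

If we then replace y by -y in (12a), we obtain

(12b)  $e^{-iy} = \cos y - i\sin y$

and subsequent addition and subtraction of (12a) and (12b) yields the equally
remarkable pair of formulas

$$\cos y = \frac{e^{iy} + e^{-iy}}{2}, \qquad \sin y = \frac{e^{iy} - e^{-iy}}{2i}.$$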
Nicolaus Mercator’s Logarithmic Series


To  calculate  the  logarithm  of  a  given  number  without  the  use  of  the 
logarithmic  table. 


This  fundamental  problem,  which  forms  the  basis  for  the  construction  of  the 
logarithmic  tables,  is  solved  simply  and  conveniently  by  logarithmic  series.  The 
simplest  logarithmic  series: 


$$x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + - \cdots,$$

which represents the natural log of 1 + x, is found for the first time in the
Logarithmotechnia (London, 1668) of the Holstein mathematician Nicolaus
Mercator (1620-1687) (whose real name was Kaufmann). For the derivation of
the logarithmic series we will make use of the mean value of the function

$$f(x) = \frac{1}{1 + x},$$

which we will therefore determine first.

We begin with inequality (2) of the preceding number (No. 13), first
converting it into an inequality for the logarithmic function nat log x
(nat log x, abbreviated as $l\,x$, is the logarithm of x when Euler’s number e is taken
as the base of the logarithmic system, i.e., the logarithm is the exponent to which
e must be raised to obtain x).

Consequently, we replace v and V with $l\,u$ and $l\,U$, where U > u > 0, and,
correspondingly, $e^v$ and $e^V$ with u and U. This gives us

$$u < \frac{U - u}{l\,U - l\,u} < U$$

or

(1)  $\dfrac{1}{U} < \dfrac{l\,U - l\,u}{U - u} < \dfrac{1}{u}$  (U > u > 0).

The mean value of the function f(x) = 1/(1 + x) is the limiting value of the
fraction

$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$

for an infinitely increasing n and $\delta = x/n$.

To determine lim $\mu$ for positive and negative values of x, respectively, we
write $1 + \nu\delta \,|\, 1 + (\nu - 1)\delta$ in (1) for the pairs U|u and u|U, respectively, and then
form (1) for $\nu$ = 1, 2, 3, ..., n. Addition of the resulting n inequalities gives in
both cases:

$$\frac{l(1 + x)}{\delta} \quad\text{lies between}\quad n\mu \quad\text{and}\quad n\mu + \frac{x}{1 + x};$$

in other words,

$$\mu \quad\text{lies between}\quad \frac{l(1 + x)}{x} \quad\text{and}\quad \frac{l(1 + x)}{x} - \frac{x}{n(1 + x)}.$$

Thus, if n becomes infinite, it follows that

(2)  $\overset{x}{\underset{0}{\mathfrak{M}}}\, \dfrac{1}{1 + x} = \dfrac{l(1 + x)}{x}$,

where (1 + x) is naturally to be considered positive.

Now for the derivation of the series for $l(1 + x)$!

From f = 1/(1 + x) it follows that f = 1 - xf. If we replace f on the right-hand
side of this equation with 1 - xf, we obtain

$$f = 1 - x + x^2 f.$$

If we again replace f on the right-hand side by 1 - xf, we obtain

$$f = 1 - x + x^2 - x^3 f.$$

Similarly, from this we obtain

$$f = 1 - x + x^2 - x^3 + x^4 f$$

etc., and in general:

$$f = 1 - x + x^2 - x^3 + x^4 - + \cdots - \varepsilon x^{n-1} + \varepsilon x^n f,$$

where $\varepsilon$ is equal to +1 for even values of n and -1 for odd values of n.
Obtaining the mean value from this formula, we have

(3)  $\dfrac{l(1 + x)}{x} = 1 - \dfrac{x}{2} + \dfrac{x^2}{3} - \dfrac{x^3}{4} + - \cdots - \varepsilon\dfrac{x^{n-1}}{n} + \varepsilon\,\mathfrak{M}\, x^n f$.


If F represents the maximum value assumed by f over the interval 0 to x (thus
F = 1 for positive values of x, F = 1/(1 + x) for negative values of x), then in
terms of the absolute value the mean value of $x^n f$ must be smaller than F
times the mean value [$x^n/(n + 1)$] of $x^n$. Accordingly, we are able to write

$$\mathfrak{M}\, x^n f = \Theta F \frac{x^n}{n + 1},$$

where $\Theta$ is a definite positive proper fraction.

This converts (3) into

$$l(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - + \cdots - \varepsilon\frac{x^n}{n} + R \quad\text{with}\quad R = \varepsilon\Theta F\frac{x^{n+1}}{n + 1}.$$

As n approaches infinity, if x is a proper fraction (also when x = +1) the
“residue” R tends toward zero.

Consequently, the following progression is valid when x is a proper fraction
and when x = 1:

(4)  $l(1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \dfrac{x^4}{4} + - \cdots$.


The  series  on  the  right-hand  side  of  the  equation  is  Mercator’s  series. 

Since  it  is  only  valid  for  proper  fractional  values  of  x,  it  is  not  suited  for 
computing  the  logarithms  of  any  number  whatever.  In  order  to  obtain  the  series 
required for this, we substitute in (4) -x for x and obtain

(5)  $l(1 - x) = -x - \dfrac{x^2}{2} - \dfrac{x^3}{3} - \dfrac{x^4}{4} - \cdots$.

Subtracting (5) from (4) gives us

$$l\,\frac{1 + x}{1 - x} = 2\left[x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right].$$

For every positive or negative proper fractional value of x, $X = \dfrac{1 + x}{1 - x}$ is positive,
while at the same time $x = \dfrac{X - 1}{X + 1}$. The formula obtained is written

(6)  $l\,X = 2\left[x + \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 + \cdots\right]$ with $x = \dfrac{X - 1}{X + 1}$.

This new series converges for every positive X.

In this series we substitute for X the quotient Z/z of two arbitrary positive
numbers. This gives us

(7)  $l\,Z - l\,z = 2\left[Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right]$ with $Q = \dfrac{Z - z}{Z + z}$.


This  series,  in  which  Z  and  z  may  be  any  two  positive  numbers,  is  the 
logarithmic  series  from  which  the  logarithmic  tables  can  be  computed. 

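In order, for example, to compute $l\,2$ we set z equal to 1 and Z to 2, which
gives us

$$l\,2 = 2\left(\frac{1}{3} + \frac{1}{3 \cdot 3^3} + \frac{1}{5 \cdot 3^5} + \frac{1}{7 \cdot 3^7} + \cdots\right).$$

In order to compute $l\,5$ we set $z = 125 = 5^3$ and $Z = 128 = 2^7$, and this gives us

$$7\,l\,2 - 3\,l\,5 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{3}{253}.$$

To compute $l\,3$ we assume that $z = 80 = 5 \cdot 2^4$, $Z = 81 = 3^4$, so that $l\,z = l\,5 + 4\,l\,2$,
$l\,Z = 4\,l\,3$. This gives us

$$4\,l\,3 - l\,5 - 4\,l\,2 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{161}.$$

To compute $l\,7$ we set z equal to $2400 = 2^5 \cdot 5^2 \cdot 3$, $Z = 2401 = 7^4$, and obtain

$$4\,l\,7 - 5\,l\,2 - 2\,l\,5 - l\,3 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{4801}.$$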

The  series  in  the  parentheses  converge  very  rapidly,  i.e.,  we  require  relatively 
few  terms  to  obtain  their  sum  fairly  exactly. 
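
The four computations can be chained exactly as described. The following sketch assumes a fixed number of terms per series (the parameter `terms` and the function name `log_diff` are illustrative choices, not taken from the text) and compares the results with the library logarithm.

```python
import math

def log_diff(Z: float, z: float, terms: int = 12) -> float:
    """l Z - l z = 2(Q + Q^3/3 + Q^5/5 + ...) with Q = (Z - z)/(Z + z)."""
    q = (Z - z) / (Z + z)
    return 2 * sum(q ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

l2 = log_diff(2, 1, terms=25)                      # Q = 1/3
l5 = (7 * l2 - log_diff(128, 125)) / 3             # Q = 3/253
l3 = (log_diff(81, 80) + l5 + 4 * l2) / 4          # Q = 1/161
l7 = (log_diff(2401, 2400) + 5 * l2 + 2 * l5 + l3) / 4   # Q = 1/4801

for name, value, n in (("l 2", l2, 2), ("l 3", l3, 3), ("l 5", l5, 5), ("l 7", l7, 7)):
    print(name, value, math.log(n))
```

The smaller Q is, the fewer terms are needed, which is why the pairs 125|128, 80|81, and 2400|2401 are chosen close together.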

Note.  The  common  logarithms  to  the  base  10  are  computed  from  the  natural 
logarithms.  From 


$$10^{\log x} = e^{l\,x} \quad (= x)$$

it follows in terms of the natural logarithms that

$$\log x \cdot l\,10 = l\,x$$

or

$$\log x = M\,l\,x,$$

where

$$M = \frac{1}{l\,10} = 0.4342944819$$

is  the  so-called  modulus  by  which  the  natural  logarithm  must  be  multiplied  to 
give  the  common  logarithm. 
Newton’s Sine and Cosine Series


Compute  the  circular  functions  sine  and  cosine  of  a  given  angle  without  the 
use  of  tables. 


The  simplest  way  of  carrying  out  the  required  computation  is  with  the  use  of 
the  sine  and  cosine  series. 

The  series  for  sin  x  and  cos  x  first  appeared  in  Newton’s  treatise  De  analysi 
per  aequationes  numero  terminorum  infinitas  (1665-1666).  (No.  13.)  The  sine 
series  appears  there  as  the  converse  of  the  arc  sine  series,  which  today  is  a  very 
uncommon  approach. 

The  derivation  of  the  sine  and  cosine  series  presented  here  is  based  upon  the 
mean  values  of  the  functions  sin  x  and  cos  x  over  the  interval  0  through  x.  (All  of 
the angles mentioned in what follows are considered in circular measure.)

The mean value M of the function sin x over the interval 0 through x is the
limiting value of the quotient

$$\mu = \frac{\sin\delta + \sin 2\delta + \cdots + \sin n\delta}{n}$$

for an infinitely increasing integral positive n, where $\delta$ represents the nth part of
x.

But the numerator of the quotient (see Note 1) possesses the value

$$\sin m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$

where m is the arithmetic mean of the n argument values $\delta, 2\delta, \ldots, n\delta$, i.e.,

$$m = \frac{n + 1}{2}\,\delta = \frac{x}{2} + \frac{\delta}{2}.$$

Consequently,

$$\mu = \frac{\sin m \,\sin\dfrac{x}{2}}{n\sin\dfrac{\delta}{2}}.$$

Since the denominator of the fraction on the right-hand side tends toward the
limit $\tfrac{1}{2}x$ as n becomes infinitely great (see Note 2), and the lim m is also equal to $\tfrac{1}{2}x$, we
obtain

$$M = \lim_{n \to \infty} \mu = \frac{\sin\dfrac{x}{2}\,\sin\dfrac{x}{2}}{\dfrac{x}{2}}$$

or

(1)  $\overset{x}{\underset{0}{\mathfrak{M}}}\, \sin x = \dfrac{1 - \cos x}{x}$.


By the same route, with the use of the formula

$$\cos\delta + \cos 2\delta + \cdots + \cos n\delta = \cos m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$

we obtain

(2)  $\overset{x}{\underset{0}{\mathfrak{M}}}\, \cos x = \dfrac{\sin x}{x}$.

The series for sin x and cos x are now very easily found. Starting with the
inequality

$$\cos x < 1,$$

we obtain the mean value for both sides and we have

$$\frac{\sin x}{x} < 1 \quad\text{or}\quad \sin x < x.$$

If we once again obtain the mean values (Formula [1] and No. 11) we obtain

$$\frac{1 - \cos x}{x} < \frac{x}{2} \quad\text{or}\quad \cos x > 1 - \frac{x^2}{2}.$$

By again obtaining the mean value we get

$$\frac{\sin x}{x} > 1 - \frac{x^2}{3!} \quad\text{or}\quad \sin x > x - \frac{x^3}{3!},$$


etc. This results in:

$$\begin{aligned}
\sin x &< x, & \cos x &< 1,\\
\sin x &> x - \frac{x^3}{3!}, & \cos x &> 1 - \frac{x^2}{2!},\\
\sin x &< x - \frac{x^3}{3!} + \frac{x^5}{5!}, & \cos x &< 1 - \frac{x^2}{2!} + \frac{x^4}{4!},\\
\sin x &> x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}, & \cos x &> 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!},
\end{aligned}$$

etc.

The integral rational functions on the right-hand side of these inequalities are the
1st, 2nd, 3rd, ..., $\nu$th approximations of the functions sin x and cos x. They are
called approximations because the degree of their deviation from the correct
circular function grows progressively smaller as the index $\nu$ becomes higher and
can be made as small as desired if $\nu$ is sufficiently great. Specifically, each of the
two circular functions lies between two successive approximations of the true
value. Thus, if we set them equal to one of these two approximations, the error
incurred is smaller than the difference between the approximations, which has
the form $x^\nu/\nu!$. The fraction $x^\nu/\nu!$, however, tends toward zero as $\nu$ becomes
infinitely great (No. 13).

Accordingly, the following progressions

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots,$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots$$

are valid for finite values of x.

If one of these series is interrupted at any point the error thereby incurred is
smaller than the first disregarded term.

With  these  series  it  is  possible  to  compute  the  sine  and  cosine  of  any  given 
angle.  They  were  used  to  draw  up  the  sine  and  cosine  tables  found  in  logarithmic 
handbooks. 

In order to illustrate the degree of approximation let us compute, for example,
sin 1° = sin x (where $x = \pi/180$). We set

$$\sin 1° = \sin x = x - \frac{x^3}{6}.$$

The error thereby incurred is smaller than $x^5/120$, and this fraction is smaller
than 0.000 000 000 02, so that, calculated exactly to 10 places, sin 1° =
0.0174524064.
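
The worked example can be repeated in a few lines. This is only an illustrative check; the variable names are chosen here, and the library sine serves as the reference value.

```python
import math

x = math.pi / 180              # 1 degree in circular measure
approx = x - x ** 3 / 6        # two-term approximation of sin x
error_bound = x ** 5 / 120     # first disregarded term

print(f"{approx:.10f}", error_bound, abs(math.sin(x) - approx))
```

The actual deviation from the library value stays below the first disregarded term, as the rule on interrupted series asserts.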

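Note 1. Summation of the series

$$S = \sin a + \sin(a + \delta) + \sin(a + 2\delta) + \cdots + \sin\bigl(a + (n - 1)\delta\bigr).$$

We multiply both sides by $2\sin\frac{\delta}{2}$ and transform each of the products on the
right in accordance with the formula

$$2\sin\frac{\delta}{2}\sin(a + \nu\delta) = \cos\left(a + \frac{2\nu - 1}{2}\delta\right) - \cos\left(a + \frac{2\nu + 1}{2}\delta\right).$$

We are then left with

$$2S\sin\frac{\delta}{2} = \cos\left(a - \frac{\delta}{2}\right) - \cos\left(a + \frac{2n - 1}{2}\delta\right).$$

Since the right side of this equation is

$$2\sin\left(a + \frac{n - 1}{2}\delta\right)\sin\frac{n\delta}{2},$$

we obtain

$$S = \sin m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$

where $m = a + \frac{n - 1}{2}\delta$ represents the mean value of all n angles

$a,\ a + \delta,\ \ldots,\ a + (n - 1)\delta.$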

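In order to obtain the sum of the series

$$\Sigma = \cos a + \cos(a + \delta) + \cdots + \cos\bigl(a + (n - 1)\delta\bigr)$$

we again multiply both sides by $2\sin\frac{\delta}{2}$, but on the right-hand side we write

$$2\sin\frac{\delta}{2}\cos(a + \nu\delta) = \sin\left(a + \frac{2\nu + 1}{2}\delta\right) - \sin\left(a + \frac{2\nu - 1}{2}\delta\right).$$

We are then left with

$$2\Sigma\sin\frac{\delta}{2} = \sin\left(a + \frac{2n - 1}{2}\delta\right) - \sin\left(a - \frac{\delta}{2}\right) = 2\cos\left(a + \frac{n - 1}{2}\delta\right)\sin\frac{n\delta}{2},$$

and we obtain

$$\Sigma = \cos m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}}.$$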

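Note 2. Proof that $\lim_{n \to \infty} n\sin\frac{w}{n} = w$.

$$\sin w = 2\sin\frac{w}{2}\cos\frac{w}{2} = 2\tan\frac{w}{2}\cos^2\frac{w}{2} = 2\tan\frac{w}{2}\left(1 - \sin^2\frac{w}{2}\right).$$

However, since sin w < w and tan w > w, it follows that

$$\sin w > 2 \cdot \frac{w}{2}\left(1 - \frac{w^2}{4}\right) \quad\text{or}\quad \sin w > w - \frac{1}{4}w^3.$$

Then $\sin\frac{w}{n}$ lies between $\frac{w}{n}$ and $\frac{w}{n} - \frac{w^3}{4n^3}$, i.e., $n\sin\frac{w}{n}$ lies between w and $w - \frac{w^3}{4n^2}$.

Thus,

$$\lim_{n \to \infty} n\sin\frac{w}{n} = w.$$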
André’s Derivation of the Secant and Tangent Series


Perhaps the most convenient and certainly the most attractive way of deriving
the power series of the functions sec x and tan x is the method of zigzag
permutations devised by the French mathematician André (Comptes Rendus,
1879, and Journal de Mathématiques, 1881).

A zigzag permutation, called by André an “alternating permutation,” of the
n numbers 1, 2, 3, ..., n is an arrangement $c_1, c_2, \ldots, c_n$ of these numbers in
which no element $c_\nu$ possesses a magnitude such that it lies between its two
neighbors $c_{\nu-1}$ and $c_{\nu+1}$. If the points $P_1, P_2, \ldots, P_n$ are marked off on a system
of coordinates such that their respective abscissas are 1, 2, ..., n and their
respective ordinates $c_1, c_2, \ldots, c_n$, and each two successive points $P_\nu$ and $P_{\nu+1}$
are connected by a line segment, the zigzag line by which the permutation gets
its name is obtained.


FIG.  2. 

A  zigzag  line  or  zigzag  permutation  can  begin  either  by  rising  or  falling.  We 
assert: 


There are as many zigzag permutations (among n elements) that begin by
rising as by falling.

Proof. Let $P_1P_2 \cdots P_n$ be the zigzag line corresponding to one zigzag
permutation. Let us draw, through its highest and lowest points, parallels to the
abscissa axis and a parallel midway between them. If we construct a mirror
image of the zigzag line upon the middle parallel, the mirror image gives us a
new zigzag line $Q_1Q_2 \cdots Q_n$ or zigzag permutation, which begins either by
falling  or  rising,  depending  upon  whether  the  first  zigzag  line  begins  by  rising  or 
falling.  Thus,  for  every  zigzag  permutation  which  begins  by  rising  (or  falling) 
we  can  obtain  a  corresponding  zigzag  permutation  which  begins  by  falling  (or 
rising).  Consequently,  there  is  an  equal  number  of  each  type. 

Naturally  there  are  just  as  many  zigzag  permutations  that  end  by  rising  as  by 
falling. 

Let us, therefore, designate the number of zigzag permutations of n elements
as $2A_n$, so that $A_n$ represents the number of zigzag permutations of n elements
that begin (or end) by rising (or falling).

The number $A_n$ can be determined by a recurrence formula. Let us consider all
the $2A_n$ zigzag permutations of the n elements 1, 2, ..., n as written down and let
us single out one of them, in which the highest element n occupies the (r + 1)th
place (counting from the left). To the left of n there are then the r elements
$\alpha_1, \alpha_2, \ldots, \alpha_r$, while to the right of n there are the s numbers $\beta_1, \beta_2, \ldots, \beta_s$, with r + s = m
= n - 1. The permutation $\alpha_1\alpha_2 \ldots \alpha_r$ ends by falling, since $\alpha_r$ is followed by n,
which is higher; the permutation $\beta_1\beta_2 \ldots \beta_s$ begins by rising, since $\beta_1$ follows n,
which is higher.

Now let there be formed from the r elements $\alpha_1, \alpha_2, \ldots, \alpha_r$ a total of $A_r$ zigzag
permutations with falling ends and, similarly, from the s elements $\beta_1, \beta_2, \ldots, \beta_s$ a
total of $A_s$ zigzag permutations with rising beginnings. Consequently, there are
$A_r \cdot A_s$ zigzag permutations of n elements in which n occupies the (r + 1)th
position and in which to the left of n there are the r elements $\alpha_1, \alpha_2, \ldots, \alpha_r$. However,
since there are many other combinations of m elements to the rth class aside
from the considered combination $\alpha_1, \alpha_2, \ldots, \alpha_r$ (as is commonly known, there
are a total of $C_m^r = m_r = m!/(r!\,s!)$), there are consequently a total of

$$p_r = m_r A_r A_s \qquad (r + s = m)$$

zigzag permutations of n elements in which the highest element (n) occupies the
(r + 1)th place. It is also easily seen that this formula is also valid for the indices
r = 0, 1, 2 if one sets $A_0 = A_1 = A_2 = 1$.

In order to obtain all the possible zigzag permutations we must obtain the
expression $p_r$ for all the values from r = 0 through r = m = n - 1 and add the
resulting products. This gives us

$$2A_n = \sum_{r=0}^{m} p_r = \sum_{r=0}^{m} m_r A_r A_s.$$

In order to simplify this formula somewhat further, we write $m!/(r!\,s!)$ instead of
$m_r$ and set

$$\frac{A_\nu}{\nu!} = a_\nu.$$

It is then transformed into

$$2na_n = a_0a_{n-1} + a_1a_{n-2} + \cdots + a_{n-1}a_0$$

or, utilizing the symbol for the sum, into

(2)  $2na_n = \sum a_r a_s$,

where r and s pass through all the possible integral numbers $\ge 0$ for which r + s
= n - 1.

Using the recurrence formula (2) it is possible to compute, beginning with $a_2$,
each number of the series $a_0, a_1, a_2, a_3, a_4, \ldots$ from the numbers preceding it.

From $a_n$, when it is multiplied by n!, it is possible to obtain half the number of
zigzag permutations of n elements.

We  can  draw  up  a  table  for  the  simplest  cases: 


n      0    1    2     3     4      5      6        7        8
a_n    1    1    1/2   1/3   5/24   2/15   61/720   17/315   277/8064
A_n    1    1    1     2     5      16     61       272      1385

We are able to confirm, for example, that the four elements 1, 2, 3, 4 yield $2 \cdot A_4$
= 10 zigzag permutations

1324, 2143, 3142, 4132,
1423, 2314, 3241, 4231,
2413, 3412.
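
The table can be regenerated from formula (2) and cross-checked by brute-force counting. This is a sketch under the conventions stated above ($a_0 = a_1 = 1$, recurrence applied from n = 2 on); the function names are illustrative, not from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def a_values(n_max: int) -> list:
    """a_0..a_n_max from 2*n*a_n = sum of a_r*a_s over r + s = n - 1 (n >= 2)."""
    a = [Fraction(1), Fraction(1)]
    for n in range(2, n_max + 1):
        total = sum(a[r] * a[n - 1 - r] for r in range(n))
        a.append(total / (2 * n))
    return a[: n_max + 1]

def is_zigzag(p) -> bool:
    """True if no interior element lies between its two neighbours."""
    return all((p[i] - p[i - 1]) * (p[i + 1] - p[i]) < 0
               for i in range(1, len(p) - 1))

def count_zigzag(n: int) -> int:
    """Count all zigzag permutations of 1..n by enumeration."""
    return sum(is_zigzag(p) for p in permutations(range(1, n + 1)))

A = [int(a_n * factorial(n)) for n, a_n in enumerate(a_values(8))]
print(A, count_zigzag(4))
```

Enumeration is feasible only for small n, whereas the recurrence yields the numbers $A_n$ directly.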

It  is  but  a  short  step  from  the  zigzag  permutations  to  the  series  for  sec  x  and 
tan  x. 

First we establish that starting with the index 3 all $a_\nu$ are proper fractions < ½.
Since the number of zigzag permutations of n elements for n > 2 is smaller than
the number of all the permutations of n elements, then $2A_n$ must be < n!, and
consequently,

$$a_n < \frac{1}{2}.$$

Therefore, the infinite series

$$y = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$$

converges absolutely and uniformly over every interval -h through +h where h
< 1. It therefore represents over this interval a continuous function that may be
differentiated term by term. The derivative of y is

$$y' = a_1 + 2a_2x + 3a_3x^2 + \cdots.$$

Since, moreover, the series for y converges absolutely, we can square it and
thereby obtain

$$y^2 = \sum_{n=1}^{\infty} b_n x^{n-1},$$

where $b_1 = 1$ and for all $n \ge 2$

$$b_n = a_0a_{n-1} + a_1a_{n-2} + a_2a_{n-3} + \cdots + a_{n-1}a_0.$$

In accordance with (2), therefore, whenever $n \ge 2$,

$$b_n = 2na_n,$$

and then

$$y^2 = 1 + 2 \cdot 2a_2x + 2 \cdot 3a_3x^2 + 2 \cdot 4a_4x^3 + \cdots.$$

If we then add one to both sides we obtain

$$1 + y^2 = 2\left[a_1 + 2a_2x + 3a_3x^2 + 4a_4x^3 + \cdots\right]$$

or

$$1 + y^2 = 2y'.$$

We write this equation

$$\frac{y'}{1 + y^2} - \frac{1}{2} = 0$$

and reflect that the left side is the derivative of the function

$$Y = \operatorname{arc\,tan} y - \frac{x}{2},$$

but that the derivative of a function (Y) can be zero only if this function is a
constant. Thus we have

$$Y = \operatorname{arc\,tan} y - \frac{x}{2} = \text{const.}$$

In order to determine the constant, we set x equal to zero and obtain for this
value of the argument x

$$y = 1, \quad \operatorname{arc\,tan} y = \frac{\pi}{4}, \quad\text{and}\quad Y = \frac{\pi}{4}.$$

The constant therefore has the value $\pi/4$, and our equation is transformed into

$$\operatorname{arc\,tan} y = \frac{\pi}{4} + \frac{x}{2}.$$

From this it follows that

$$y = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right),$$

and we have the progression

(3)  $\tan\left(\dfrac{\pi}{4} + \dfrac{x}{2}\right) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$,

which is true in any case for every proper fractional positive or negative value of
x.

We replace x in (3) by -x and obtain

(4)  $\tan\left(\dfrac{\pi}{4} - \dfrac{x}{2}\right) = a_0 - a_1x + a_2x^2 - a_3x^3 + - \cdots$.

As is easily seen, however, the two trigonometric formulas

$$2\sec x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) + \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$

and

$$2\tan x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) - \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$

are true.

If we introduce on the right-hand side here the series indicated in (3) and (4) we
obtain the progressions for sec x and tan x which we were seeking:

$$\sec x = a_0 + a_2x^2 + a_4x^4 + a_6x^6 + \cdots,$$

$$\tan x = a_1x + a_3x^3 + a_5x^5 + a_7x^7 + \cdots$$

or, if we return to half the number of zigzag permutations, $A_n$,

$$\sec x = A_0 + A_2\frac{x^2}{2!} + A_4\frac{x^4}{4!} + A_6\frac{x^6}{6!} + \cdots,$$

$$\tan x = A_1x + A_3\frac{x^3}{3!} + A_5\frac{x^5}{5!} + A_7\frac{x^7}{7!} + \cdots.$$

These two progressions are true in all cases for every proper fractional value of
x.

However, since sec x and tan x as functions of the complex argument x are
analytic functions of x and the singular point closest to zero is $x = \pi/2$, the
convergence circle has the radius $\pi/2$.

The two power series for sec x and tan x consequently converge for
every x the absolute value of which lies below $\pi/2$.
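
As a numerical illustration, the two series can be truncated using only the values $A_0$ through $A_8$ from the table. The sketch below is an approximation with nine coefficients, not a full implementation; the names `ZIGZAG_A`, `sec_series`, and `tan_series` are chosen here for illustration.

```python
import math
from math import factorial

# A_0 .. A_8, half the numbers of zigzag permutations (from the table).
ZIGZAG_A = [1, 1, 1, 2, 5, 16, 61, 272, 1385]

def sec_series(x: float) -> float:
    """Truncated series A_0 + A_2 x^2/2! + ... + A_8 x^8/8!."""
    return sum(ZIGZAG_A[n] * x ** n / factorial(n) for n in range(0, 9, 2))

def tan_series(x: float) -> float:
    """Truncated series A_1 x + A_3 x^3/3! + A_5 x^5/5! + A_7 x^7/7!."""
    return sum(ZIGZAG_A[n] * x ** n / factorial(n) for n in range(1, 9, 2))

x = 0.2  # sample argument (assumed), well inside the convergence circle
print(sec_series(x), 1 / math.cos(x))
print(tan_series(x), math.tan(x))
```

Closer to $\pi/2$ the convergence slows markedly, so more coefficients would be needed there.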


Gregory’s Arc Tangent Series

Determine  the  angles  of  a  triangle  from  the  sides  without  the  use  of  tables. 

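If a, b, c are the given sides of the triangle, $\alpha$, $\beta$, $\gamma$ the angles (given in arc
measure), the following relations, as is well known, are obtained:

$$\tan\frac{\alpha}{2} = \frac{\rho}{u}, \qquad \tan\frac{\beta}{2} = \frac{\rho}{v}, \qquad \tan\frac{\gamma}{2} = \frac{\rho}{w},$$

where $\rho^2 = uvw/s$, u = s - a, v = s - b, w = s - c, 2s = a + b + c. Thus, $\alpha/2$, $\beta/2$,
$\gamma/2$ are the arcs whose tangents are $\rho/u$, $\rho/v$, $\rho/w$. We write

$$\frac{\alpha}{2} = \operatorname{arc\,tan}\frac{\rho}{u}, \qquad \frac{\beta}{2} = \operatorname{arc\,tan}\frac{\rho}{v}, \qquad \frac{\gamma}{2} = \operatorname{arc\,tan}\frac{\rho}{w}.$$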

Arc  tan  x  is  understood  to  represent  the  arc  whose  tangent  is  x.  The  function  arc 
tan  x  is  called  a  cyclometric  function. 

We can consider our problem solved if we can succeed in calculating the
cyclometric function arc tan x for any given x. This can be calculated by means
of the power series for the arc tangent function obtained in 1671 by the
Scottish mathematician James Gregory (1638-1675).
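
The reduction of the triangle problem to the arc tangent can be sketched as follows. Here the library function `math.atan` stands in for the series (an assumption made only to keep the example short), and `triangle_angles` is a hypothetical helper name.

```python
import math

def triangle_angles(a: float, b: float, c: float):
    """Angles (in arc measure) opposite the sides a, b, c,
    from tan(alpha/2) = rho/u etc. with rho^2 = u*v*w/s."""
    s = (a + b + c) / 2
    u, v, w = s - a, s - b, s - c
    rho = math.sqrt(u * v * w / s)
    return tuple(2 * math.atan(rho / t) for t in (u, v, w))

alpha, beta, gamma = triangle_angles(3, 4, 5)
print(alpha, beta, gamma, alpha + beta + gamma)
```

For the sides 3, 4, 5 one finds $\rho = 1$, so the half angles are arc tan 1/3, arc tan 1/2, and arc tan 1.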

To derive the arc tangent series we make use of the mean value of the
function $F(x) = \dfrac{1}{1 + x^2}$, which we must consequently compute beforehand.

On a tangent of a unit circle $\mathfrak{K}$ we mark off from the point of tangency A the two segments Ap = v and AP = V in such a manner that $Pp = \varphi = V - v$; we connect p and P with the center of the circle O and designate the distances Op and OP as r and R, their intersections with $\mathfrak{K}$ as q and Q, and the arcs Aq, AQ, qQ in that order as w, W, $\omega$. This gives us the equations w = arc tan v, W = arc tan V, $\omega$ = arc tan V - arc tan v.

We would like to enclose the area ($\tfrac{1}{2}\varphi$) of the triangle OPp between two bounds, and for this purpose we draw the two arcs ph and PH concentric to qQ so that they meet OP and the extension of Op at h and H. The area of the triangle is then greater than the area ($\tfrac{1}{2}r^2\omega$) of the sector Oph but smaller than the area ($\tfrac{1}{2}R^2\omega$) of the sector OPH, so that

$$r^2\omega < \varphi < R^2\omega.$$

It follows from this that

$$\frac{1}{R^2} < \frac{\omega}{\varphi} < \frac{1}{r^2}$$

or, if instead of $\varphi$, $\omega$, $r^2$, and $R^2$ we write in the same order V - v, arc tan V - arc tan v, $1 + v^2$, $1 + V^2$ (Pythagoras),

$$(1) \qquad \frac{1}{1 + V^2} < \frac{\arctan V - \arctan v}{V - v} < \frac{1}{1 + v^2}.$$


In order to determine the mean value of the function $F(x) = \dfrac{1}{1 + x^2}$ over the interval 0 through x, i.e., the limiting value of

$$M = \frac{F(\delta) + F(2\delta) + \cdots + F(n\delta)}{n}$$

(where $\delta = x/n$), in (1) we substitute successively $0|\delta$, $\delta|2\delta$, $2\delta|3\delta$, ..., $(n-1)\delta|n\delta$ for the value pair $v|V$, add the resulting inequalities, and obtain

$$nM < \frac{\arctan x}{\delta} < nM + 1 - \frac{1}{1 + x^2}$$

or

$$\frac{\arctan x}{x} - \frac{x^2}{n(1 + x^2)} < M < \frac{\arctan x}{x}.$$

As the limit $n = \infty$ is approached this inequality is transformed into

$$(2) \qquad \mathfrak{M}_0^x\, \frac{1}{1 + x^2} = \frac{\arctan x}{x}.$$

Now for the derivation of the arc tangent series!
It is

$$\frac{1}{1 + x^2} = 1 - \frac{x^2}{1 + x^2}$$

or

$$F = 1 - x^2F,$$

if for the sake of brevity we write F for F(x). If we replace the F on the right-hand side of this equation with $1 - x^2F$, we obtain

$$F = 1 - x^2 + x^4F.$$

If here we once again write $1 - x^2F$ for F on the right-hand side, we obtain

$$F = 1 - x^2 + x^4 - x^6F.$$

In a similar manner, from this we obtain

$$F = 1 - x^2 + x^4 - x^6 + x^8F,$$

$$F = 1 - x^2 + x^4 - x^6 + x^8 - x^{10}F,$$


etc. Consequently, we obtain the inequality

$$1 - x^2 + x^4 - x^6 + - \cdots - x^{4n-2} < F < 1 - x^2 + x^4 - x^6 + - \cdots + x^{4n}.$$

Obtaining the mean value here gives us

$$1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots - \frac{x^{4n-2}}{4n-1} < \frac{\arctan x}{x} < 1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots + \frac{x^{4n}}{4n+1}$$

or

$$(3) \qquad x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1} < \arctan x < x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots + \frac{x^{4n+1}}{4n+1}.$$

If we then set

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1}$$

or rather

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1} + \frac{x^{4n+1}}{4n+1},$$

the error thereby incurred is smaller than the difference $x^{4n+1}/(4n+1)$ of the boundaries of (3). Since, however, this difference tends toward zero when n becomes infinitely great and x is a proper fraction (also when x = 1), we obtain the progression

$$(4) \qquad \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots \qquad (\text{for } x \le 1).$$


This is Gregory's formula. If the progression is interrupted at any point the error incurred is smaller than the first disregarded term.
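The error statement can be tried out numerically. The following minimal sketch assumes $0 \le x \le 1$; the function names are chosen here for illustration and `math.atan` serves only as the comparison value.

```python
import math

def gregory_arctan(x, terms):
    """Sum of the first `terms` terms of x - x^3/3 + x^5/5 - ... for 0 <= x <= 1."""
    total, power = 0.0, x
    for k in range(terms):
        total += (-1) ** k * power / (2 * k + 1)
        power *= x * x            # next odd power of x
    return total

def first_disregarded_term(x, terms):
    """Magnitude of the first term left out after `terms` terms."""
    return x ** (2 * terms + 1) / (2 * terms + 1)
```

With x = 1/2 and five terms, for instance, the first disregarded term is $(1/2)^{11}/11$, roughly 0.00004, and the partial sum should differ from arc tan 1/2 by less than that.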

The series cannot be used when x is an improper fraction, because it no longer converges. In order to calculate arc tan x in this case we introduce y = 1/x, the reciprocal value of x, and make use of the formula

$$(5) \qquad \arctan x + \arctan y = \frac{\pi}{2}.$$

[If arc tan x = $\alpha$, i.e., x = tan $\alpha$, then from

$$\tan\left(\frac{\pi}{2} - \alpha\right) = \frac{1}{\tan\alpha} = \frac{1}{x} = y$$

we obtain by inversion

$$\frac{\pi}{2} - \alpha = \arctan y \quad\text{or}\quad \frac{\pi}{2} = \arctan x + \arctan y.]$$

We then obtain arc tan y in accordance with Gregory’s formula and arc tan x in accordance with (5).

But even if x is a proper fraction the arc tangent series is not advisable when x is very close to 1. In this case we introduce $z = \dfrac{1 - x}{1 + x}$, the half reciprocal value of x, and make use of the formula

$$(6) \qquad \arctan x + \arctan z = \frac{\pi}{4}.$$

[If arc tan x = $\alpha$, i.e., x = tan $\alpha$, then from

$$\tan\left(\frac{\pi}{4} - \alpha\right) = \frac{1 - \tan\alpha}{1 + \tan\alpha}$$

we obtain by inversion

$$\frac{\pi}{4} - \alpha = \arctan\frac{1 - x}{1 + x} \quad\text{or}\quad \frac{\pi}{4} = \arctan x + \arctan z.]$$

Thus we obtain arc tan z with Gregory’s formula and then arc tan x with (6).
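Formulas (5) and (6) together reduce every positive argument to a small one, where Gregory's series converges quickly. The sketch below illustrates this under assumptions made here: the switching point 1/2 for the use of (6) is an arbitrary choice, the function name is hypothetical, and the value of $\pi$ is taken from the library.

```python
import math

def arctan_any(x, terms=30):
    """arc tan x for x >= 0, using Gregory's series only for arguments <= 1/2."""
    if x > 1:
        # formula (5): arctan x = pi/2 - arctan(1/x)
        return math.pi / 2 - arctan_any(1 / x, terms)
    if x > 0.5:
        # formula (6): arctan x = pi/4 - arctan((1 - x)/(1 + x))
        return math.pi / 4 - arctan_any((1 - x) / (1 + x), terms)
    # Gregory's series (4); the first disregarded term is below (1/2)^61 / 61
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))
```

An argument between 1/2 and 1 is sent by (6) to a value z below 1/3, so that thirty terms of the series are far more than double precision requires.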
Note. If in (4) we set x = 1, we obtain the so-called Leibniz series:

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots,$$

which was discovered by Leibniz independently of Gregory in 1674.

It is not advisable, however, to use this series to calculate $\pi$. The series discovered by the English mathematician John Machin (d. 1751), which was published by him in 1706, is much better suited for this purpose. Machin made use of the auxiliary angle $\lambda$ whose tangent is $\tfrac{1}{5}$. From $\tan\lambda = \tfrac{1}{5}$ it follows that $\tan 2\lambda = 2\tan\lambda/(1 - \tan^2\lambda) = \tfrac{5}{12}$ and from this, similarly, that

$$\tan 4\lambda = 2\tan 2\lambda/(1 - \tan^2 2\lambda) = \frac{120}{119}.$$

Inversion gives us $4\lambda = \arctan\tfrac{120}{119}$ or

$$\arctan\frac{120}{119} = 4\arctan\frac{1}{5}.$$

The left side of this equation, according to (5), has the value $\tfrac{\pi}{2} - \arctan\tfrac{119}{120}$; $\arctan\tfrac{119}{120}$, however, according to (6), has the value $\tfrac{\pi}{4} - \arctan\tfrac{1}{239}$, so that the left side is $\tfrac{\pi}{4} + \arctan\tfrac{1}{239}$. Consequently,

$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}$$

or written out completely:

$$\frac{\pi}{4} = 4\left(\frac{1}{5} - \frac{1}{3\cdot 5^3} + \frac{1}{5\cdot 5^5} - + \cdots\right) - \left(\frac{1}{239} - \frac{1}{3\cdot 239^3} + \frac{1}{5\cdot 239^5} - + \cdots\right).$$

Using this series, Machin calculated $\pi$ to 100 decimal places.
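Machin's steps can be followed with exact fractions. The sketch below is an illustration only: the function names are chosen here, and twelve terms of each series are an arbitrary choice that is sufficient for double precision, not a reconstruction of Machin's own computation.

```python
from fractions import Fraction

def tan_double(t):
    """tan 2λ from t = tan λ, by the double-angle formula used in the text."""
    return 2 * t / (1 - t * t)

def arctan_reciprocal(n, terms):
    """Partial sum of Gregory's series for arc tan (1/n), as an exact fraction."""
    x = Fraction(1, n)
    return sum(Fraction((-1) ** k, 2 * k + 1) * x ** (2 * k + 1)
               for k in range(terms))

def machin_pi(terms=12):
    """pi = 4 * (4 arc tan 1/5 - arc tan 1/239), with truncated series."""
    return 4 * (4 * arctan_reciprocal(5, terms) - arctan_reciprocal(239, terms))
```

Applying `tan_double` twice to 1/5 reproduces the values 5/12 and 120/119. Because the powers of 1/5 and 1/239 decrease so rapidly, a dozen terms already fix $\pi$ to more places than a floating-point number can hold, in contrast to the Leibniz series.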
Buffon’s Needle Problem


Parallel lines are drawn on a table at intervals of d. A needle of length l smaller than d is thrown at random on the table. What is the probability that the needle will touch one of the parallels?

This  remarkable  problem  stems  from  Georges  Louis  Leclerc,  Comte  de 
Buffon  (1707-1788),  who  was  the  first  man  to  clothe  probability  problems  in 
geometric  form. 

The  probability  of  an  event  is  commonly  understood  to  mean  the  ratio  of  the 
number  of  cases  favoring  an  event  to  the  total  number  of  possible  cases. 

Let  the  probability  we  are  seeking  be  W. 

Let the needle have the terminal points A and B. Let us imagine the parallels extended horizontally. Let us single out two such adjacent parallels I and II (below I) and from any point P on line I let us drop a perpendicular PQ (= d) to line II.

Let us begin by considering the special positions S of the needle which are characterized by the following three conditions: (1) the terminal point A lies on the segment PQ; (2) the needle lies to the right of QP; (3) AB forms an acute angle with AP: the inclination of the needle toward QP.

Let the probability that the needle touches parallel I in any of the special positions be w.

First  we  will  show  that 


W  =  w. 

If we consider all of the positions S' in which the terminal point A of the needle lies on the segment PQ but the needle is otherwise arbitrarily situated (i.e., touching either I or II or neither), this quadruples (as compared to the number of positions S) both the number of all the possible cases and the number of all the favorable cases.

The probability of touching one of the two parallels I and II in all of the positions S' is, therefore, likewise w.

If to the cases S' we add those positions in which the terminal point B instead of terminal point A comes to rest on the segment PQ, we obtain a total of S'' positions, which doubles the number of possible cases as well as the number of favorable cases.

Consequently, the probability of touching one of the parallels I and II in the positions S'' is also w.


Now if instead of taking one perpendicular PQ we take a very great number $\nu$ of very closely situated equidistant successive perpendiculars between I and II and consider all the positions of the needle in which one end of the needle comes to rest upon one of these $\nu$ perpendiculars, we thereby multiply by $\nu$ (with respect to S'') the number of all the possible as well as that of all the favorable cases.

Consequently, the probability of touching one of the parallels I and II by a needle position in which one needle end lies between I and II is again w.

The addition of still a third parallel III representing a mirror image of I on II (or of II on I), as well as the addition of the needle positions in which one end of the needle lies between III and II (or between III and I), again give us a probability of w.

In short, we have shown that

W = w.

Consequently, our problem has been limited to the task of determining the probability w of the needle touching line I in a special position S.

FIG. 3.


To obtain a better view of the infinitely great number of special positions, let us divide the above segment PQ into a very great number N (say $N > 1000^{1000}$) of equal parts and let us consider all of the cases in which the needle end A cuts one of the dividing points. For each dividing point there are an infinitely great number of possibilities corresponding to the infinitely great number of possible needle angles. For convenience in considering these possibilities also, let us consider only the M angles

$$\theta_0 = 0,\quad \theta_1 = \varepsilon,\quad \theta_2 = 2\varepsilon,\quad \theta_3 = 3\varepsilon,\quad \ldots,\quad \theta_{M-1} = (M - 1)\varepsilon,$$

where M likewise represents a very great number (e.g., $M > 2^{73}$) and $\varepsilon$ is the Mth part of $\pi/2$.

In  this  manner  our  consideration  involves  N  points  and  M  angles,  thus,  a  total 
of  NM needle  positions. 

However, only a certain fraction (just w) of these positions are favorable. In order to determine this fraction we begin by obtaining the total number of only those favorable positions in which the angle of inclination of the needle has the selected value $\theta_s$ as illustrated in Figure 3. These positions form a parallelogram EFGP with the sides EF = l and EP = $l\cos\theta_s$. Since there are

$$N\cdot\frac{EP}{PQ} = N\,\frac{l}{d}\cos\theta_s$$

dividing points on the segment EP, our overall total comprises

$$N\,\frac{l}{d}\cos\theta_s$$

favorable positions (with the common needle angle $\theta_s$). The number n of all the favorable positions altogether is consequently

$$n = N\,\frac{l}{d}\,(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}).$$

The probability that we are seeking is, therefore,

$$w = \frac{n}{NM} = \frac{l}{d}\cdot\frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$

There remains then only the task of determining the value of the fraction

$$m = \frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$

The fraction m is no different from the mean value of the cosine function over the interval 0 through $\pi/2$.

Those who are familiar with the elements of integral calculus will immediately be able to write this mean value; it is

$$m = \frac{2}{\pi}\int_0^{\pi/2}\cos\theta\,d\theta = \frac{2}{\pi}.$$


Those  readers  who  are  not  familiar  with  this  type  of  calculation  can  obtain  m 
just  as  easily  in  the  following  adroit  manner. 

Draw a quadrant of a circle with a radius of 1, designating the horizontal arm as OH and the vertical as OK. If this is rotated about the radius OK it forms a hemisphere the area of whose surface is commonly known to be $2\pi$.

The area of this surface can be expressed in a different form.

For this purpose let us move the above angles of inclination $\theta_0, \theta_1, \theta_2, \ldots, \theta_{M-1}$ so that the angles are formed at O with OH. The resulting free arms divide the quadrant into M very small arcs with the common length $\varepsilon$. Let us select from among them the one lying between the free arms of the angles $\theta_s$ and $\theta_{s+1}$. On being rotated it forms a very small spherical zone, which when flattened out to a strip possesses the length $2\pi\cos\theta_s$ and the height $\varepsilon$, so that the area is then $2\pi\varepsilon\cos\theta_s$.

Since the sum of all the spherical zones obtained in this manner gives the hemisphere, we obtain the equation

$$2\pi\varepsilon(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}) = 2\pi$$

or, since $M\varepsilon = \pi/2$,

$$\frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M} = \frac{2}{\pi}.$$

Thus, we have obtained the mean value that we were seeking.

The mean value of the cosine function (naturally that of the sine function also) over the interval 0 through $\pi/2$ is $2/\pi$.

[This  also  follows  from  formulas  (1)  and  (2)  of  No.  15.] 
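The mean value can also be checked by simply carrying out the sum for a large but finite M. This is a numerical sketch only; the function name is chosen here and the library value of $\pi$ is used to form the angles.

```python
import math

def mean_cosine(M):
    """(cos θ_0 + cos θ_1 + ... + cos θ_(M-1)) / M with θ_s = s·ε and ε = (π/2)/M."""
    eps = (math.pi / 2) / M
    return sum(math.cos(s * eps) for s in range(M)) / M
```

Since the cosine decreases over the interval and each small arc is represented by its left-hand angle, the finite sum lies slightly above $2/\pi$, by roughly 1/(2M).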

At the same time we obtain

$$w = \frac{l}{d}\,m = \frac{l}{d}\cdot\frac{2}{\pi}$$

or

$$W = \frac{2l}{\pi d}.$$

This formula gives us the probability we were seeking.

Note. Wolf in Zurich (1850) arrived at the original idea of using the obtained formula to calculate the number $\pi$. Experimentally, by a great number (5000) of throws with a needle 36 mm long and a distance of 45 mm between the parallels, he found the probability W to be (approximately) 0.5064, and obtained

$$\pi = \frac{2l}{dW} = 3.1596.$$

The Englishmen Smith (1855) and Fox (1864) repeated the experiment and found with 3200 and 1100 throws, respectively, values of 3.1553 and 3.1419 for $\pi$.
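Wolf's experiment can be imitated with random numbers. The sketch below is a simulation under assumptions made here: the end A is placed uniformly between two adjacent parallels, the direction of the needle is uniform over the full circle, and the random generator is seeded so that a run can be repeated. Since the angle is drawn with the help of the library value of $\pi$, this is an illustration of the formula and not an independent determination of $\pi$.

```python
import math
import random

def buffon_probability(l, d, throws, seed=0):
    """Fraction of simulated throws in which a needle of length l < d touches a parallel."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        y_a = rng.uniform(0, d)                  # height of end A above parallel II
        theta = rng.uniform(0, 2 * math.pi)      # direction of the needle
        y_b = y_a + l * math.sin(theta)          # height of end B
        if y_b <= 0 or y_b >= d:                 # B has reached or passed a parallel
            hits += 1
    return hits / throws

def pi_from_frequency(l, d, W):
    """The estimate 2l/(dW) for pi from an observed frequency W."""
    return 2 * l / (d * W)
```

With l = 36, d = 45 and W = 0.5064 the second function returns approximately 3.1596, the value reported by Wolf; the simulated frequency for the same needle should lie near $2l/(\pi d)$, about 0.509.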
The Fermat-Euler Prime Number Theorem


Every  prime  number  of  the  form  4n  +  1  can  be  represented  in  only  one 
manner  as  the  sum  of  two  squares. 

This famous theorem was discovered about 1660 by Pierre de Fermat (1601-1665), the greatest French mathematician of the seventeenth century. It was not
published,  however,  until  1670,  when  it  appeared,  unfortunately  without  proof, 
in  the  notes  to  the  works  of  Diophantus,  edited  by  Fermat’s  son.  It  is  not  certain 
whether  or  not  Fermat  had  obtained  the  proof. 

The  first  proof  of  the  theorem  was  presented  almost  100  years  later  by 
Leonhard  Euler  in  his  treatise  “Demonstratio  theorematis  Fermatiani,  omnem 
numerum  primum  formae  4n  +  1  esse  summam  duorum  quadratorum”  (Novi 
Commentarii  Academiae  Petropolitanae  ad  annos  1754-1755,  vol.  V),  after 
years  of  fruitless  attempts  at  its  solution. 

Today  there  are  several  proofs  of  the  Fermat-Euler  theorem.  The  following 
proof  is  distinguished  by  its  great  simplicity. 

For  the  reader  who  is  unfamiliar  with  problems  of  number  theory  we  will 
provide  several  explanations  that  will  be  necessary  for  understanding  this  proof 
and  will  also  be  found  useful  for  the  problem  dealt  with  in  No.  22.  At  the  same 
time,  it  is  to  be  understood  that  the  letters  used  here  and  in  No.  22  represent 
whole  numbers. 

Two numbers a and b (according to Gauss) are called congruent to the modulus m,

written: $a \equiv b \bmod m$, read: a congruent to b modulo m,

when their difference is divisible by m. Every number, for example, in regard to the modulus (to the modulus, modulo) m, is congruent to the residue it leaves over when divided by m, for example $65 \equiv 2 \bmod 7$. And this is also true when the word residue is taken in its most general sense, in which it means the residue left after division when the quotient is arbitrarily chosen. If, for instance, we write 65/7 = 12, we remain with a residue of -19.

Among the many possible residues two are of special importance: the conventional or common residue, which is positive and smaller than the divisor, and the minimal residue, the magnitude of which never exceeds half the divisor. A minimal residue of the division 89/13 is, for example, -2, because $89/13 = 7 - \tfrac{2}{13}$, which can also be written $89 \equiv -2 \bmod 13$.

The  following  self-evident  rules  apply  to  congruences  to  the  same  modulus: 

1. If two numbers are congruent to a third, they are also congruent to each other.

2. Two congruences can be added, subtracted, and multiplied.

From

$$A \equiv B \bmod m, \qquad a \equiv b \bmod m$$

it follows that

$$A \pm a \equiv B \pm b \bmod m$$

and

$$Aa \equiv Bb \bmod m.$$

[From A = B + Gm and a = b + gm it follows, for example, that Aa = Bb + g'm (g' integral), i.e., $Aa \equiv Bb \bmod m$.]

3. The congruence

$$a \equiv b \bmod m$$

may be multiplied by any whole number g:

$$ag \equiv bg \bmod m.$$

It can be divided by g only when g is a common divisor of a and b that has no common divisor with the modulus. If, for example, we divide $49 \equiv 14 \bmod 5$ by 7, we obtain a correct congruence $7 \equiv 2 \bmod 5$.

A system of m integral numbers no two of which are congruent with respect to the modulus m is called a complete residue system to the modulus m. The simplest complete residue system is the system of the m common residues 0, 1, 2, ..., m - 1, and the next simplest is the system of m minimal residues.

Every number z is congruent with respect to the modulus m to one and only one number of a complete residue system mod m.

Of  particular  importance  is  the  following  theorem: 

Theorem:  If  the  numbers  of  a  complete  residue  system  are  multiplied  by  a 
number  possessing  no  common  divisor  with  the  modulus ,  there  is  obtained  once 
again  a  complete  residue  system  with  respect  to  the  modulus. 

Proof. Let m be the modulus, a the multiplier possessing no common divisor with m. If then for two different numbers x and x' of the given residue system $ax \equiv ax' \bmod m$ were true, it would follow from congruence rule 3 that $x \equiv x' \bmod m$, which, however, is not the case. The m products are therefore mutually incongruent and thus form a complete residue system.

From this theorem it follows directly that:

The congruence

$$ax \equiv b \bmod m,$$

in which a and m possess no common divisor, possesses in each complete residue system mod m one and only one “root” x.
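Both statements are easy to try out on a small modulus. The sketch below searches the common residue system 0, 1, ..., m - 1 directly; the function names are chosen here for illustration.

```python
def multiplied_system(a, m):
    """Common residues of a·0, a·1, ..., a·(m-1) modulo m, in ascending order."""
    return sorted((a * x) % m for x in range(m))

def roots(a, b, m):
    """All x among 0, 1, ..., m-1 with a·x ≡ b (mod m)."""
    return [x for x in range(m) if (a * x - b) % m == 0]
```

For m = 12 and a = 7, which has no common divisor with 12, the multiplied system is again 0, 1, ..., 11, and $7x \equiv 3 \bmod 12$ has the single root x = 9. For a = 4, which shares the divisor 4 with the modulus, the congruence $4x \equiv 2 \bmod 12$ has no root at all.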

Quadratic  Residues 

Of two numbers possessing no common divisor one is called the quadratic residue of the other when it is congruent to a square number with respect to the other as modulus; if there is no such square number it is called a quadratic nonresidue. For example, 12 is a quadratic residue of 13, since $12 \equiv 8^2 \bmod 13$; -1 is a quadratic nonresidue of 3, since there exists no square number $x^2$ such that $x^2 \equiv -1 \bmod 3$.

The following theorems concerning quadratic residues and nonresidues apply to an odd prime number modulus p:

I. There are a total of P = (p - 1)/2 mutually incongruent quadratic residues and just as many mutually incongruent nonresidues of p. The former are $1^2, 2^2, 3^2, \ldots, P^2$, or whichever numbers are congruent to them mod p.

II.  The  product  of  two  residues  is  a  residue ,  the  product  of  a  residue  and  a 
nonresidue  is  a  nonresidue ,  and  finally,  the  product  of  two  nonresidues  is  a 
residue. 

Proof of I. 1. If two of the designated squares were congruent to each other, for example $x^2 \equiv y^2 \bmod p$, the product (x + y)(x - y) [which is equal to $x^2 - y^2$] would be divisible by p, which is impossible, because both of its factors are smaller than p.

2. If we continue the series of squares beyond $P^2$, no new residues are obtained. The square $(P + h)^2$, for example, is congruent to $k^2 \bmod p$ if $k \le P$ is so determined that P + h + k is divisible by p, since then $P + h \equiv -k$ and moreover $(P + h)^2 \equiv k^2 \bmod p$. Since there are (aside from the number divisible by p, disregarded here) 2P numbers mutually incongruent mod p, there must be a total of P mutually incongruent quadratic nonresidues of p.

Proof of II. Let R and r be quadratic residues, N and n quadratic nonresidues of p.

1. From $A^2 \equiv R$, $a^2 \equiv r \bmod p$ we obtain by multiplication $(Aa)^2 \equiv Rr \bmod p$. Consequently, Rr is a residue.

2. The 2P numbers $1^2, 2^2, \ldots, P^2, N\cdot 1^2, N\cdot 2^2, \ldots, N\cdot P^2$ are mutually incongruent mod p. Since the first P of these numbers are quadratic residues of p, and since only P residues exist, the P numbers $N\cdot 1^2, N\cdot 2^2, \ldots, N\cdot P^2$ must be nonresidues, i.e., NR is a nonresidue.

3. The 2P numbers $n\cdot 1^2, n\cdot 2^2, n\cdot 3^2, \ldots, n\cdot P^2, n\cdot N\cdot 1^2, n\cdot N\cdot 2^2, \ldots, n\cdot N\cdot P^2$ are mutually incongruent mod p. The first P of these numbers are nonresidues in accordance with 2.; consequently, the others must be residues in accordance with 1.; however, among them is the product of the two nonresidues N and n. Q.E.D.
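Theorems I and II can be verified by enumeration for any small odd prime. The following sketch does so directly from the definitions; the function names are chosen here for illustration.

```python
def quadratic_residues(p):
    """The common residues of 1^2, 2^2, ..., P^2 modulo the odd prime p, P = (p-1)/2."""
    P = (p - 1) // 2
    return {(k * k) % p for k in range(1, P + 1)}

def product_rule_holds(p):
    """Check theorem II for p: residue·residue and nonresidue·nonresidue give
    residues, residue·nonresidue gives a nonresidue."""
    res = quadratic_residues(p)
    non = set(range(1, p)) - res
    rr = all((x * y) % p in res for x in res for y in res)
    rn = all((x * y) % p in non for x in res for y in non)
    nn = all((x * y) % p in res for x in non for y in non)
    return rr and rn and nn
```

For p = 13 the residues are 1, 3, 4, 9, 10, 12, six in number as theorem I requires, and 12 is among them in agreement with the example above.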

Let us now consider the bilinear congruence

$$(0) \qquad xy \equiv D \bmod p,$$

in which the modulus p is once again an odd prime number, D a given number possessing no common divisor with p, and the “mutually conjugate” or “linked” magnitudes x and y are chosen in such a manner from the system X of the numbers 1, 2, 3, ..., p - 1 that (0) is satisfied. For each x from X there is then only one conjugate y. [From $xy \equiv D \bmod p$ and $xy' \equiv D \bmod p$ it follows that $xy \equiv xy' \bmod p$ and from this $y \equiv y' \bmod p$ or $y - y' \equiv 0 \bmod p$. However, since both y and y' are $\le p - 1$, their difference is divisible by p only when y' = y.]

We select $x_1$ arbitrarily from X and determine $y_1$ such that

$$x_1y_1 \equiv D \bmod p.$$

Then we select from X a number $x_2$ that differs from $x_1$ and $y_1$ and determine $y_2$ such that

$$x_2y_2 \equiv D \bmod p.$$

$y_2$ then is different from $x_1$ as well as from $y_1$.

We  continue  in  this  manner  until  all  the  numbers  of  X  have  been  arranged  in 
the  resulting  congruences. 

Here  there  are  two  cases  to  be  distinguished: 

1. $y_\nu$ never equals $x_\nu$. In other words: the congruence $x^2 \equiv D \bmod p$ is impossible; D is a quadratic nonresidue of p. We then obtain exactly P = (p - 1)/2 pairs $x_\nu, y_\nu$ of conjugate numbers, and multiplication of the P congruences formed gives

$$(1) \qquad (p - 1)! \equiv D^P \bmod p.$$

2. For a certain index $\nu$, $y_\nu = x_\nu$, thus $x_\nu^2 \equiv D \bmod p$; D is a quadratic residue of p. If aside from $\nu$ there is also an index $\mu$ for which the same occurs, then $x_\mu^2 \equiv D \bmod p$, and so $x_\mu^2 \equiv x_\nu^2 \bmod p$, i.e., $x_\mu^2 - x_\nu^2$ or $(x_\mu + x_\nu)(x_\mu - x_\nu)$ is divisible by p. Since $x_\mu - x_\nu$ is not divisible by p, $x_\mu + x_\nu$ must be divisible by p, and consequently $x_\mu = p - x_\nu$. Actually, then $x_\mu^2 = p^2 - 2px_\nu + x_\nu^2 \equiv x_\nu^2 \equiv D \bmod p$. Equal linked magnitudes thus occur exactly twice if they occur at all. In our case ($y_\nu = x_\nu$, $y_\mu = x_\mu$) we now have only P - 1 congruences $x_sy_s \equiv D \bmod p$, where $y_s$ differs from $x_s$. To these P - 1 congruences we add the congruence

$$x_\nu x_\mu \equiv -D \bmod p,$$

multiply all P congruences and obtain

$$(2) \qquad (p - 1)! \equiv -D^P \bmod p.$$

This is the case when, for example, D = 1, since then $1^2 \equiv D \bmod p$. Then we have the congruence

$$(2a) \qquad (p - 1)! \equiv -1 \bmod p,$$

which represents the so-called Wilson theorem.
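The pairing of conjugate numbers can be listed explicitly for a small prime. The sketch below finds each conjugate by direct search; the function names are chosen here, and each pair is recorded once with the smaller number first.

```python
from math import factorial

def conjugate_pairs(p, D):
    """Pairs (x, y) from 1, ..., p-1 with x·y ≡ D (mod p), each listed once (x <= y)."""
    pairs = []
    for x in range(1, p):
        # the one and only conjugate of x
        y = next(y for y in range(1, p) if (x * y - D) % p == 0)
        if x <= y:
            pairs.append((x, y))
    return pairs

def wilson_residue(p):
    """Common residue of (p-1)! modulo p."""
    return factorial(p - 1) % p
```

For p = 13 and the nonresidue D = 2 there are exactly P = 6 pairs and none with equal members, as in case 1. For the residue D = 3 the equal linked magnitudes 4 and 9 = 13 - 4 appear, as in case 2. In both cases (p - 1)! leaves the residue 12, i.e., -1, in agreement with Wilson's theorem.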

Using Wilson’s formula we write instead of (1) and (2)

$$(1a) \quad D^P \equiv -1 \pmod p \qquad\qquad (2a) \quad D^P \equiv 1 \pmod p$$

and obtain

Euler’s theorem: The number D that possesses no common divisor with the prime number p is either a quadratic residue or nonresidue of p, depending on whether $D^{(p-1)/2}$ is congruent mod p to the positive or negative unit.

The introduction of the Legendre symbol makes it possible to express this criterion of the residue character of a number by a formula. The Legendre symbol $\left(\frac{D}{p}\right)$ represents the positive or negative unit, depending on whether D is a quadratic residue or nonresidue of p. Thus, for example, $\left(\frac{2}{7}\right) = 1$, since $3^2 - 2$ is divisible by 7, whereas $\left(\frac{2}{3}\right) = -1$, since there is no square number whose difference from 2 is divisible by 3.

When this symbol is used Euler’s criterion assumes the simple form

$$(3) \qquad \left(\frac{D}{p}\right) \equiv D^P \bmod p, \quad\text{with } P = \frac{p - 1}{2}.$$


In the simple case D = -1, congruence (3) is transformed into the equation

$$(4) \qquad \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2},$$

since in this case both sides of (3) are units, and the difference between two units is divisible by the odd prime number p only when these units are equal.

Now $\frac{p - 1}{2}$ is even or odd, depending on whether the prime number p is of the form 4n + 1 or 4n + 3; by (4), -1 is accordingly a quadratic residue or a quadratic nonresidue of p. Consequently, the following is true:

Theorem of Euler: The negative unit is a quadratic residue of the prime number p, when p has the form 4n + 1 and a quadratic nonresidue when p has the form 4n + 3.

In other words: The pure quadratic congruence

$$x^2 + 1 \equiv 0 \bmod p$$

has integral solutions x when p has the form 4n + 1 and has none when p has the form 4n + 3.
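Euler's criterion gives a direct way of computing the Legendre symbol. The sketch below assumes that p is an odd prime not dividing D; the function names are chosen here, and Python's built-in three-argument `pow` performs the modular exponentiation.

```python
def legendre(D, p):
    """Legendre symbol (D/p) for an odd prime p not dividing D, by Euler's criterion."""
    t = pow(D % p, (p - 1) // 2, p)     # D^P mod p, which is either 1 or p - 1
    return -1 if t == p - 1 else t

def roots_of_minus_one(p):
    """All x among 1, ..., p-1 with x^2 + 1 ≡ 0 (mod p)."""
    return [x for x in range(1, p) if (x * x + 1) % p == 0]
```

The examples of the text are reproduced: (2/7) = 1 and (2/3) = -1. For p = 13, of the form 4n + 1, the congruence $x^2 + 1 \equiv 0$ has the solutions 5 and 8; for p = 7, of the form 4n + 3, it has none.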

Now  for  the  proof  of  the  Fermat-Euler  theorem ! 

The  following  proof  is  based  upon  the  above  theorems  and  the 

Norm  theorem  :  If  a  prime  number  goes  into  a  norm  but  not  into  the  bases 
of  the  norm,  it  is  itself  a  norm. 

A  norm  is  understood  to  mean  the  sum  of  the  squares  of  two  whole  numbers, 
which  are  the  “bases”  of  the  norm. 

Proof of the norm theorem. Let the prime number p go into the norm $a^2 + b^2$, but not into its bases a and b, so that

$$(5) \qquad a^2 + b^2 = pf,$$

it being assumed that the factor f is greater than 1 but smaller than p/2. This assumption does not represent a limitation of the theorem, since from $A^2 + B^2 = pF$, with F > p/2, we can immediately form the equation $a^2 + b^2 = pf$ with f < p/2, if the minimal residues A - hp and B - kp of the divisions A/p and B/p, respectively, are taken for a and b, respectively. On the one hand,

$$a^2 + b^2 = [A^2 + B^2] - 2(Ah + Bk)p + (h^2 + k^2)p^2$$

is divisible by p, and thus

$$a^2 + b^2 = pf,$$

while on the other hand, since $|a| < \tfrac{1}{2}p$ and $|b| < \tfrac{1}{2}p$, $a^2 + b^2$ is smaller than $\tfrac{1}{2}p^2$ or $pf < \tfrac{1}{2}p^2$ or $f < \tfrac{1}{2}p$. Moreover, p does not go into either a or b, because then (contrary to our assumption) it would go into A = a + hp or into B = b + kp.

We determine the minimal residues $\alpha = a - mf$ and $\beta = b - nf$ of the divisions a/f and b/f and obtain similarly

$$(6) \qquad \alpha^2 + \beta^2 = ff', \quad\text{with } f' \le \tfrac{1}{2}f.$$

Multiplication of (5) and (6) gives us

$$(a^2 + b^2)(\alpha^2 + \beta^2) = pf^2f'$$

or

$$(a\alpha + b\beta)^2 + (a\beta - b\alpha)^2 = pf^2f'.$$

Since

$$a\alpha + b\beta = [a^2 + b^2] - (am + bn)f = a'f,$$
$$a\beta - b\alpha = (bm - an)f = b'f,$$

the equation obtained is written

$$(7) \qquad a'^2 + b'^2 = pf', \quad\text{where } f' \le \tfrac{1}{2}f.$$

Here f' cannot disappear. If f' = 0, then in accordance with (6) $\alpha = 0$ and $\beta = 0$, and from this it follows that a = mf and b = nf, then according to (5) $p = (m^2 + n^2)f$. In this event p would have to be divisible by f and then f would have to equal 1, which contradicts our premise.

If, then, f' = 1, (7) already gives us the norm expression of p.

If f' > 1, we obtain from (7)

$$(8) \qquad a''^2 + b''^2 = pf'' \quad\text{with } 0 < f'' \le \tfrac{1}{2}f',$$

just as (7) was obtained from (5). This method of constructing new equations with continuously diminishing factors f, f', f'', ... is continued until the factor 1 appears. The corresponding equation gives the prime number p represented as a norm.
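The proof of the norm theorem is constructive and can be followed step by step as a small program. The sketch below makes two assumptions that go beyond the text: the starting norm is taken as $x^2 + 1$, with x found by direct search among 1, ..., p - 1, and the function names are chosen here for illustration.

```python
def minimal_residue(a, f):
    """Residue of a modulo f whose magnitude does not exceed f/2."""
    r = a % f
    return r - f if 2 * r > f else r

def two_squares(p):
    """Represent a prime p of the form 4n + 1 as a^2 + b^2 by the descent (5)-(8)."""
    # starting norm x^2 + 1 divisible by p
    x = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    a, b = minimal_residue(x, p), 1
    f = (a * a + b * b) // p                       # equation (5): a^2 + b^2 = p·f
    while f > 1:
        alpha, beta = minimal_residue(a, f), minimal_residue(b, f)
        # equation (7): the new bases a' and b'; both divisions are exact
        a, b = (a * alpha + b * beta) // f, (a * beta - b * alpha) // f
        f = (a * a + b * b) // p                   # the diminished factor f'
    return tuple(sorted((abs(a), abs(b))))
```

For p = 13 the search gives x = 5 and $5^2 + 1^2 = 13\cdot 2$; one step of the descent with $\alpha = \beta = 1$ yields a' = 3, b' = 2, and thus $13 = 2^2 + 3^2$.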

Now  we  will  prove 

I.  A  prime  number  q  of  the  form  4n  +  3  cannot  be  represented  as  a  norm. 

II.  Every  prime  number  p  of  the  form  4n  +  1  can  be  represented  as  a  norm  in 
only  one  way. 

Proof of I. If it were true that

$$a^2 + b^2 = q,$$

then it would follow that

$$b^2 \equiv -a^2 \bmod q$$

and the product $(-1)(a^2)$ of a quadratic nonresidue (-1) and a residue ($a^2$) of q would be a quadratic residue ($b^2$) of q, which according to the above is impossible.

Proof of II. According to Euler’s theorem there is a whole number x such that the norm $x^2 + 1$ is divisible by p. According to the norm theorem, p is then itself a norm:

$$p = a^2 + b^2.$$

Here also there is only one possible norm representation.

If we assume a second such representation:

$$p = A^2 + B^2$$

(where a, b, A, B represent four different positive numbers), it follows that

$$p^2 = (a^2 + b^2)(A^2 + B^2) = (Aa \pm Bb)^2 + (Ab \mp Ba)^2,$$

where either the two upper signs or the two lower signs are possible. Then, since the product of the two factors Aa + Bb and Aa - Bb:

$$A^2a^2 - B^2b^2 = A^2(a^2 + b^2) - b^2(A^2 + B^2)$$

is divisible by p, one of the factors must be divisible by p. Consequently, we select the upper or lower signs depending upon whether the first or second factor is divisible by p. Then either

Aa + Bb = p and at the same time Ab - Ba = 0

or

Ab + Ba = p and at the same time Aa - Bb = 0,

thus, either $A^2b^2 = B^2a^2$ or $A^2a^2 = B^2b^2$.

From the first of these equations it follows that

$$\frac{A^2}{a^2} = \frac{B^2}{b^2} = \frac{A^2 + B^2}{a^2 + b^2} = 1,$$

and from the second

$$\frac{A^2}{b^2} = \frac{B^2}{a^2} = \frac{A^2 + B^2}{b^2 + a^2} = 1,$$

thus, from the first A = a, while from the second A = b, both of which contradict the initial assumption, which requires that $A \ne a$ and $A \ne b$. There is therefore only one way of representing p as a norm, and the Fermat-Euler theorem is proved.
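The theorem can be confirmed for small numbers by listing all norm representations directly. This is a brute-force sketch, with a function name chosen here; representations that differ only in the order of the bases are counted once.

```python
from math import isqrt

def norm_representations(n):
    """All pairs (a, b) with 0 < a <= b and a^2 + b^2 = n."""
    reps = []
    for a in range(1, isqrt(n // 2) + 1):     # a <= b forces a^2 <= n/2
        b2 = n - a * a
        b = isqrt(b2)
        if b * b == b2:
            reps.append((a, b))
    return reps
```

The prime 13 has the single representation $2^2 + 3^2$ and the prime 7, of the form 4n + 3, has none. The composite number 65, by contrast, has the two representations $1^2 + 8^2$ and $4^2 + 7^2$, so the restriction to prime numbers is essential.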


20 


The  Fermat  Equation 


Find the integral solutions of the equation

$$x^2 - dy^2 = 1,$$

in which d is a nonquadratic positive whole number.


This extremely important problem of number theory was posed by Pierre Fermat in 1657, first to his friend Frenicle and then to all contemporary mathematicians.

The  first  solution,  a  very  complicated  one,  was  obtained  by  the  Englishmen 
Lord  Brouncker  and  John  Wallis. 

The  simplest  and  best  solutions  to  this  problem  were  discovered  by  Euler, 
Lagrange,  and  Gauss.  [Euler:  “De  usu  novi  algorithmi  ...,”  Novi  Commentarii 
Academiae  Petropolitanae  ad  annum  1765.  Lagrange:  “Solution  d’un  probleme 
d’arithmetique,”  Miscellanea  Taurinensia,  vol.  IV,  1768.  Gauss:  Disquisitiones 
arithmeticae,  1801.]  They  are  all  based  upon  the  properties  of  periodic  continued 
fractions. 

We  will  examine  a  somewhat  modified  form  of  this  method  with  the  more 
general  equation 


$$X^2 - DY^2 = 4,$$

which includes the original Fermat equation (with $X = 2x$, $Y = y$, $D = 4d$) as a special case, but includes as well the case in which $D$ leaves a residue of 1 on being divided by 4.

For the sake of convenience we shall write the continued fraction

$$a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \cdots}}}$$

in the abbreviated form $(a, b, c, d, \ldots)$.

A purely periodic continued fraction with an $n$-term period has the form

$$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots),$$

so that we may write

$$u = (g_1, g_2, \ldots, g_N, u),$$

where $N$ is an integral multiple of $n$, which we will assume to be even for reasons presently to be described. The terms (partial denominators) $g_1, g_2, \ldots$ are assumed to be positive whole numbers. If we designate the numerator and denominator of the $N$th approximation $(g_1, g_2, \ldots, g_N)$ and of the $(N - 1)$th approximation $(g_1, g_2, \ldots, g_{N-1})$ as $P$ and $Q$ and $p$ and $q$, respectively, then according to continued fraction theory we obtain the two equations

(1) $Pq - Qp = 1$ and (2) $u = \dfrac{Pu + p}{Qu + q}$,

the second of which may also be expressed in the form

(2a) $Qu^2 - Hu - p = 0$ with $H = P - q$.

The discriminant $D = H^2 + 4Qp$ of the quadratic equation (2a) has, according to (1), the value $H^2 + 4Pq - 4 = (P + q)^2 - 4$; it is consequently smaller by 4 than a square number and therefore cannot itself be a square number. Its (positive) root $r = \sqrt{D}$ is therefore irrational. Moreover, since $r > H$ (because $r^2 = H^2 + 4Qp$), the second root $\bar{u} = (H - r)/2Q$ of the quadratic equation is negative, so that the first root $(H + r)/2Q$ represents our (improperly fractionated) continued fraction $u$. To obtain information about the magnitude of $\bar{u}$ we form the product of the roots $u\bar{u} = -p/Q$ and obtain

$$-\bar{u} = \frac{p/Q}{u}.$$

Since $P > p$ and $Q > q$, then

$$-\bar{u} < \frac{P/Q}{u} \quad\text{and}\quad -\bar{u} < \frac{p/q}{u}.$$

One of the right-hand fractions, however, is a proper fraction, since the value $u$ of the continued fraction lies between the two successive approximations $p/q$ and $P/Q$; therefore, $-\bar{u}$ must be a proper fraction.

A  quadratic  equation  with  integral  coefficients  and  a  nonquadratic 
discriminant  whose  first  root  is  a  positive  and  improper  fraction  while  the  second 
root  is  a  negative  proper  fraction  is  called  a  reduced  equation,  and  its  first  root  is 
called  a  reduced  number.  Our  conclusion  therefore  reads: 

Every  purely  periodic,  improperly  fractionated,  continued  fraction  is  a 
reduced  number. 
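The conclusion can be illustrated on a concrete period. The sketch below is only an illustration under stated assumptions: the period (3, 2, 3, 10) is chosen as a sample, and the helper name `convergents` is hypothetical. It builds the approximations with the usual recurrence, forms the quadratic equation $Qu^2 - Hu - p = 0$, and checks numerically that the first root exceeds 1 while the second lies between $-1$ and 0.

```python
from math import sqrt

def convergents(terms):
    """Return the successive approximations (numerator, denominator) of (g1, g2, ...)."""
    P_prev, Q_prev = 1, 0          # conventional starting values of the recurrence
    P, Q = terms[0], 1
    result = [(P, Q)]
    for g in terms[1:]:
        P, P_prev = g * P + P_prev, P
        Q, Q_prev = g * Q + Q_prev, Q
        result.append((P, Q))
    return result

period = [3, 2, 3, 10]             # sample period with an even number of terms
(p, q), (P, Q) = convergents(period)[-2:]
H = P - q
D = H * H + 4 * Q * p              # discriminant of Q*u^2 - H*u - p = 0
r = sqrt(D)
u = (H + r) / (2 * Q)              # first root: the value of the continued fraction
u_bar = (H - r) / (2 * Q)          # second root

print(P * q - Q * p)               # the quantity appearing in equation (1)
print(D == (P + q) ** 2 - 4)       # D is smaller by 4 than a square
print(u > 1, -1 < u_bar < 0)       # the two conditions for a reduced number
```

Floating-point square roots are adequate here because only inequalities are tested; the integer identities are checked exactly.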

We  will  now  show  conversely  that  the  continued  fraction  of  a  reduced 
number  is  purely  periodic. 

First,  we  will  solve  the  problem: 

Obtain the first root $u = (r - b)/2a$ of the quadratic equation

(3) $au^2 + bu + c = 0$

with integral coefficients having no common divisor and the positive nonquadratic discriminant $D = r^2 = b^2 - 4ac$ in the form of a continued fraction.

We write

$$u = g + \frac{1}{u'},$$

where $g$ is the largest whole number below $u$ (in the following to be designated as $[u]$) and $u'$ a positive improper fraction. We introduce three new magnitudes $a'$, $b'$, $c'$ that are of the opposite sign and equal to the magnitudes $ag^2 + bg + c$, $2ag + b$, and $a$, and we obtain

$$u' = \frac{1}{u - g} = \frac{2a}{r + b'} = \frac{2a(r - b')}{r^2 - b'^2} = \frac{r - b'}{2a'}$$

with

$$b'^2 - 4a'c' = b^2 - 4ac = D.$$

Consequently, $u'$ is the first root of the quadratic equation

(3') $a'u'^2 + b'u' + c' = 0$,

which likewise belongs to the discriminant $D$ and possesses coefficients having no common divisor. (If $a'$, $b'$, $c'$ possessed a common divisor, the latter, because of the equations $-c' = a$, $-b' = 2ag + b$, $-a' = ag^2 + bg + c$, would go into $a$, $b$, $c$, which contradicts our assumption.) We call the new equation (3') the derivative of the initial equation (3) and its first root $u'$ the derivative of $u$.

The new coefficients $a'$, $b'$, $c'$ are calculated in practice in accordance with the following system:

    a      b         c
           ag
           ag + b    (ag + b)g

We add the two terms of the third column and change the sign of the sum, thus obtaining $a'$. We add the two lower terms of the second column, change the sign of the sum and get $b'$. We change the sign of $a$ and get $c'$.

The  derived  quadratic  equation  (3')  is  treated  in  exactly  this  manner  and  the 
process  continued  as  far  as  desired.  The  following  example  is  presented  to  make 
the  process  completely  clear. 

Expand the positive root of the quadratic equation

$$3u^2 - 10u - 1 = 0$$

into a continued fraction. The discriminant is 112, thus $r = 10.58\ldots$. In the scheme we will write in only the coefficients of the successive quadratic equations, each of which is the derivative of the preceding one. In the last column we will write the first root of the appropriate equation and the highest integer contained in it, which is at the same time the corresponding partial denominator of the continued fraction.


     a      b      c      first root
     3    -10     -1      (10.58… + 10)/6 = 3 + …
            9
           -1     -3
     4     -8     -3      (10.58… + 8)/8 = 2 + …
            8
            0      0
     3     -8     -4      (10.58… + 8)/6 = 3 + …
            9
            1      3
     1    -10     -3      (10.58… + 10)/2 = 10 + …
           10
            0      0
     3    -10     -1

Since we come back to the initial equation, the expansion is purely periodic, and we obtain

$$\frac{\sqrt{112} + 10}{6} = (3, 2, 3, 10, 3, 2, 3, 10, \ldots).$$
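The scheme lends itself to exact integer arithmetic. The following is a minimal sketch, assuming $a > 0$ and a reduced starting equation (so that the expansion returns to its start); the names `derive` and `expand` are illustrative. It uses the fact that for a positive integer $m$ and an integer $k$ the greatest integer in $(\sqrt{D} + k)/m$ equals the greatest integer in $([\sqrt{D}] + k)/m$.

```python
from math import isqrt

def derive(a, b, c, D):
    """One step of the scheme for a*u^2 + b*u + c = 0 with a > 0.

    Returns (g, a', b', c'), where g = [u] for the first root u = (sqrt(D) - b) / (2a).
    """
    g = (isqrt(D) - b) // (2 * a)
    a_new = -(a * g * g + b * g + c)
    b_new = -(2 * a * g + b)
    c_new = -a
    return g, a_new, b_new, c_new

def expand(a, b, c, max_terms=1000):
    """Collect partial denominators until the initial equation reappears."""
    D = b * b - 4 * a * c
    start = (a, b, c)
    terms = []
    for _ in range(max_terms):
        g, a, b, c = derive(a, b, c, D)
        terms.append(g)
        if (a, b, c) == start:
            break
    return terms

print(expand(3, -10, -1))
```

Each call to `derive` reproduces one block of the scheme: the equation 3, -10, -1 leads to 4, -8, -3, and so on until the starting coefficients return.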

Now  for  the  proof  of  the  theorem  that  the  expansion  of  a  reduced  number 
yields  a  purely  periodic  continued  fraction! 

Since the first root $u$ of the reduced equation

$$au^2 + bu + c = 0$$

is a positive improper fraction, and the second one, $\bar{u}$, is a negative proper fraction, then according to the relations

$$u\bar{u} = \frac{c}{a}, \qquad u + \bar{u} = -\frac{b}{a}$$

between roots and coefficients, both the free term $c$ and the coefficient $b$ of the linear term of a reduced equation are always negative (the coefficient $a$ is assumed to be always positive).

In accordance with the expansion examined above we write

(4) $u = g + \dfrac{1}{u'}$

with $g = [u]$ and $u' > 1$. From $u' = 1/(u - g)$ it follows initially that the first root $u'$ of the derived equation is a positive improper fraction. If we then transform $r$ into $-r$ in the equation $u' = 1/(u - g)$, the equation assumes the form $\bar{u}' = 1/(\bar{u} - g)$ and shows that the second root $\bar{u}'$ is a negative proper fraction. The derivative of a reduced equation or number is consequently also reduced, so that only reduced numbers occur in the continued fraction expansion of a reduced number.

If we write (4), with $r$ replaced by $-r$, in the form

$$-\frac{1}{\bar{u}'} = g - \bar{u},$$

we see that $g$ can also be taken as the greatest integer that is contained in the reciprocal value of opposite sign of the second root of the derived equation.

Now, the number of all the reduced numbers corresponding to a given discriminant $D$ is finite. (From $D = b^2 - 4ac$ and $-ac > 0$ it follows first that the $b$'s must be sought only among the numbers of the series $-1, -2, \ldots, -[r]$. Of these the only ones that need be considered are those for which $D - b^2$ is divisible by 4. We select these, and for each such $b$ we determine the pairs of numbers $a$, $c$ [with $a > 0$, $c < 0$] for which $-ac = (D - b^2)/4$, which in turn gives us a finite quantity of numbers $a$ and $c$. Each number triplet $a$, $b$, $c$ obtained in this way, however, leads to a reduced equation $au^2 + bu + c = 0$ and thus to a reduced number $u$ only when $2a$ lies between $r + b$ and $r - b$.)
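The finiteness argument is itself a search procedure. The sketch below follows it step by step; the function name `reduced_equations` is hypothetical, and, like the parenthetical argument, it does not filter out triplets whose coefficients share a common divisor. Because $r$ is irrational, the condition $r + b < 2a < r - b$ can be tested exactly with the integer part $[r]$.

```python
from math import isqrt

def reduced_equations(D):
    """List all triplets (a, b, c) with b*b - 4*a*c == D and r + b < 2a < r - b."""
    s = isqrt(D)                       # s = [r]
    if s * s == D:
        raise ValueError("D must not be a square number")
    found = []
    for b in range(-1, -s - 1, -1):    # b = -1, -2, ..., -[r]
        if (D - b * b) % 4 != 0:
            continue
        m = (D - b * b) // 4           # m = -a*c, a positive integer
        for a in range(1, m + 1):
            # 2a + b < r  is the same as  2a + b <= s;  r < 2a - b  is the same as  s < 2a - b
            if m % a == 0 and 2 * a + b <= s < 2 * a - b:
                found.append((a, b, -(m // a)))
    return found

for triplet in reduced_equations(112):
    print(triplet)
```

For the discriminant 112 the list contains, among others, the four equations met in the worked expansion above.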

Consequently, in the continued fraction expansion of a reduced number $U$ there must reappear after a finite number of steps a reduced number previously obtained, e.g., in such manner:

$$U = (K, L, u),$$

$$u = (h, k, l, u).$$

But since, in accordance with the above, both $l$ and $L$ represent the greatest integer that is contained in the reciprocal value of $\bar{u}$ of opposite sign, $L = l$. Similarly, we find that $K = k$.

Consequently,

$$U = (k, l, h, k, l, h, \ldots),$$

i.e.: The expansion of a reduced number yields a purely periodic continued fraction.

After these preliminaries the solution of the Fermat equation becomes quite simple. We will show: I. that the continued fraction expansion of any reduced number belonging to the discriminant $D$ yields an infinite number of solutions of the Fermat equation; II. that every solution of the equation is obtained by this expansion.

I. Let

$$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots)$$

be the positive root of the reduced equation

(5) $au^2 + bu + c = 0$

with the discriminant $D$ and coefficients possessing no common divisor. Also, let

$$\frac{P}{Q} = (g_1, g_2, \ldots, g_N)$$

be an approximation of $u$ whose index number $N$ is even and a multiple of $n$, and let

$$\frac{p}{q} = (g_1, g_2, \ldots, g_{N-1})$$

be the preceding approximate fraction; then, according to (2a),

(5') $Qu^2 - Hu - p = 0 \quad (H = P - q)$.

Since the roots of (5) and (5') agree and the coefficients of (5) possess no common divisor, it must be possible to obtain (5') from (5) by multiplication with a certain whole number $y$, such that

(6) $Q = ay, \quad P - q = -by, \quad p = -cy$.

If we then introduce the whole number

(7) $x = P + q$,

we obtain from (6) and (7)

$$x^2 - b^2y^2 = (P + q)^2 - (P - q)^2 = 4Pq$$

and

$$4acy^2 = -4Qp,$$

from which by addition we obtain

$$x^2 - Dy^2 = 4(Pq - Qp),$$

and, using (1),

$$x^2 - Dy^2 = 4.$$

II. Conversely, now let $x|y$ represent a solution of the Fermat equation

(8) $x^2 - Dy^2 = 4$

in nonevanescent positive integers $x$ and $y$ and let $u$ represent the first root of a reduced equation

$$au^2 + bu + c = 0.$$

Making use of (6) and (7), we obtain the four nonevanescent positive integers

$$P = \frac{x - by}{2}, \qquad Q = ay, \qquad p = -cy, \qquad q = \frac{x + by}{2}.$$

(It is immediately obvious that $Q$ and $p$ are such numbers, whereas for $P$ and $q$ it follows from equation (8), if we make use also of the equation $D = b^2 - 4ac$ to write:

$$(x + by)(x - by) = x^2 - b^2y^2 = 4(1 - acy^2) = 4(1 + Qp).$$

We are then able to conclude from the appearance of the nonevanescent integer on the right, which is divisible by 4, that the two integral factors $2q$ and $2P$ of the product on the left-hand side have to be even and not equal to zero.) According to (8) they satisfy the equation

(9) $Pq - Qp = 1$.

If we then replace the coefficients $a$, $b$, $c$ in the reduced equation with $Q/y$, $-(P - q)/y$, $-p/y$, we get

(10) $u = \dfrac{Pu + p}{Qu + q}$.


Before we get from here to the continued fraction expansion, we still have to prove that $Q \geq q$.

It is true that $2(Q - q) = (2a - b)y - x$. Since the second root $\bar{u}$ of the reduced equation is a negative proper fraction, it follows that $r + b < 2a$ or $2a - b > r$. Consequently,

$$2(Q - q) > ry - x = \frac{r^2y^2 - x^2}{ry + x} = \frac{-4}{ry + x}$$

or $Q - q > -2/(ry + x)$. However, since $D = r^2 = b^2 - 4ac$ is at least equal to 5, $y$ is at least 1, and $x$ at least 3, it follows that $ry + x > 5$ and from this $Q - q > -0.4$, i.e., $Q \geq q$. Q.E.D.

We now expand $P/Q$ into a continued fraction $(\gamma_1, \gamma_2, \ldots, \gamma_\nu)$ with the even number of terms $\nu$ in such a manner that between it and the last approximate fraction $p'/q'$ there exists the relation

(9') $Pq' - Qp' = 1$.

From (9) and (9') it then follows by subtraction that

$$P(q' - q) = Q(p' - p).$$

However, since $q \leq Q$, $q' < Q$, and $(q' - q)$ is divisible by $Q$, $q'$ must equal $q$ and therefore $p'$ must also equal $p$. We then obtain

$$(\gamma_1, \gamma_2, \ldots, \gamma_\nu, u) = \frac{Pu + p}{Qu + q},$$

i.e., because of (10),

$$u = (\gamma_1, \gamma_2, \ldots, \gamma_\nu, u).$$

Every solution $x|y$ of the Fermat equation can therefore be obtained by the expansion of any reduced number $u$ as a continued fraction.

Final result: The Fermat equation

$$x^2 - Dy^2 = 4$$

has an infinite number of solutions; these can all be obtained in accordance with rules (6) and (7) from those approximation values, comprising whole periods and an even number of terms, that are obtained from the expansion as a continued fraction of any arbitrarily selected reduced number belonging to the discriminant $D$.

Example. Find the smallest solution $x|y$ of the Fermat equation

$$x^2 - 112y^2 = 4.$$

A reduced equation applying to the discriminant 112 is the equation treated above

$$3u^2 - 10u - 1 = 0;$$

the expansion of the reduced number $u$ reads

$$u = (3, 2, 3, 10, 3, 2, 3, 10, \ldots)$$

and has a four-termed period. The first four approximate fractions are

$$\frac{3}{1}, \quad \frac{7}{2}, \quad \frac{p}{q} = \frac{24}{7}, \quad \frac{P}{Q} = \frac{247}{72}.$$

Since here $a = 3$, $b = -10$, $c = -1$, we find, in accordance with (6) and (7), that

$$x = 254, \quad y = 24.$$
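The whole procedure of the example can be sketched in a few lines. The code below is an illustration under stated assumptions: the input must be a reduced equation with $a > 0$, and the function names are hypothetical. It expands the first root until the period closes, doubles the period if its length is odd (so that the number of terms $N$ is even), forms the last two approximations, and applies rules (6) and (7).

```python
from math import isqrt

def period_of_reduced(a, b, c, limit=10_000):
    """Partial denominators of one period for a reduced equation a*u^2 + b*u + c = 0."""
    D = b * b - 4 * a * c
    s = isqrt(D)
    start, terms = (a, b, c), []
    for _ in range(limit):
        g = (s - b) // (2 * a)                       # g = [u]
        a, b, c = -(a * g * g + b * g + c), -(2 * a * g + b), -a
        terms.append(g)
        if (a, b, c) == start:
            return terms
    raise ValueError("no period found; is the equation reduced?")

def fermat_solution(a, b, c):
    """Return (x, y) with x*x - D*y*y == 4, obtained from one even run of periods."""
    terms = period_of_reduced(a, b, c)
    if len(terms) % 2:                               # N must be even
        terms = terms * 2
    P_prev, Q_prev, P, Q = 1, 0, terms[0], 1
    for g in terms[1:]:
        P, P_prev = g * P + P_prev, P
        Q, Q_prev = g * Q + Q_prev, Q
    p, q = P_prev, Q_prev                            # the preceding approximation p/q
    assert P * q - Q * p == 1                        # equation (1)
    return P + q, Q // a                             # x = P + q, y = Q / a

x, y = fermat_solution(3, -10, -1)
print(x, y, x * x - 112 * y * y)
```

With the equation of the example the sketch reproduces the approximations 24/7 and 247/72 internally and returns the pair found in the text.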

It now remains to be shown that there is at least one reduced number corresponding to each discriminant $D$.

1. If $D = 4n$ and $g$ is the maximum integer that is contained in $\sqrt{n}$, then

$$a = 1, \quad b = -2g, \quad c = g^2 - n$$

are the coefficients of a reduced equation.

Proof. The discriminant of the equation is $b^2 - 4ac = 4n = D$. Moreover,

$$r + b < 2a < r - b,$$

since

$$2\sqrt{n} - 2g < 2 < 2\sqrt{n} + 2g.$$

2. If $D = 4n + 1$ and $g$ is the largest integer for which $g^2 + g$ will be smaller than $n$ (so that $(g + 1)^2 + (g + 1) > n$ or $g^2 + 3g + 2 > n$), then

$$a = 1, \quad b = -(2g + 1), \quad c = g^2 + g - n$$

are the coefficients of a reduced equation.

Proof. The discriminant of the equation is $b^2 - 4ac = 4n + 1 = D$. Also,

$$r + b < 2a < r - b,$$

since

$$\sqrt{D} - (2g + 1) < 2 < \sqrt{D} + 2g + 1.$$

(That $\sqrt{D} - 2g - 1 < 2$ follows from the above condition $g^2 + 3g + 2 > n$. On multiplication by 4 this becomes

$$4g^2 + 12g + 9 > 4n + 1, \quad\text{i.e., it becomes}\quad (2g + 3)^2 > D.$$

From this it follows that

$$2g + 3 > \sqrt{D} \quad\text{or}\quad \sqrt{D} - 2g - 1 < 2.)$$
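The two rules translate directly into a small routine. This is a sketch only, assuming $D$ is a positive nonquadratic integer of the form $4n$ or $4n + 1$; the names `reduced_equation_for` and `is_reduced` are illustrative. The second function tests the condition $r + b < 2a < r - b$ exactly by means of $[r]$.

```python
from math import isqrt

def reduced_equation_for(D):
    """Coefficients (a, b, c) of a reduced equation with discriminant D."""
    s = isqrt(D)
    if s * s == D or D % 4 not in (0, 1):
        raise ValueError("D must be nonquadratic and of the form 4n or 4n + 1")
    if D % 4 == 0:
        n = D // 4
        g = isqrt(n)                   # greatest integer contained in sqrt(n)
        return 1, -2 * g, g * g - n
    n = (D - 1) // 4
    g = 0
    while (g + 1) ** 2 + (g + 1) < n:  # largest g with g*g + g < n
        g += 1
    return 1, -(2 * g + 1), g * g + g - n

def is_reduced(a, b, c):
    """Check r + b < 2a < r - b for r = sqrt(b*b - 4*a*c), r irrational."""
    D = b * b - 4 * a * c
    s = isqrt(D)
    return a > 0 and s * s != D and 2 * a + b <= s < 2 * a - b

for D in (5, 8, 12, 13, 21, 112):
    a, b, c = reduced_equation_for(D)
    print(D, (a, b, c), is_reduced(a, b, c))
```

For $D = 112$ the rule yields the equation with coefficients 1, -10, -3, which is one of the derivatives met in the worked expansion.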

Note.  If  we  have  found  the  minimal  solution  of  the  Fermat  equation  (e.g., 
by  the  method  just  presented),  we  can  find  the  other  solutions  (we  will  consider 
only  positive  solutions)  in  a  simpler  manner  after  Lagrange. 

We assign to each solution $x|y$ the "Lagrange number"

$$z = \tfrac{1}{2}(x + yr)$$

and call $x$ and $y$ the components of the Lagrange number.

We will first prove the auxiliary theorem: The product and the improperly fractionated quotient $\zeta = \tfrac{1}{2}(\xi + \eta r)$ of two Lagrange numbers $Z = \tfrac{1}{2}(X + Yr)$ and $z = \tfrac{1}{2}(x + yr)$ is also a Lagrange number.

Proof. We immediately find that

$$\zeta\bar{\zeta} = 1 \quad\text{or}\quad \xi^2 - D\eta^2 = 4$$

with

$$\xi = \frac{Xx \pm DYy}{2}, \qquad \eta = \frac{Yx \pm Xy}{2},$$

where the upper sign is used when we are concerned with the product and the lower when we are concerned with the quotient.

From $X > rY$ and $x > ry$ it follows that $Xx > DYy$, so that $\xi$ is positive in every case. From

$$\frac{X^2}{Y^2} = D + \frac{4}{Y^2}, \qquad \frac{x^2}{y^2} = D + \frac{4}{y^2}$$

it follows in the case of $\zeta = Z/z$, since then $Y > y$, that $X/Y < x/y$ or $Yx > Xy$, so that $\eta$ is also positive in every case. Consequently, $\zeta$ is positive and improper because $\zeta\bar{\zeta} = 1$.

Now it merely remains to show that $\xi$ and $\eta$ are integers. Either $D$ is divisible by 4 or $D$ leaves a residue of 1 on division by 4. In the first case $X$ and $x$ are even. In the second case every solution of the Fermat equation consists either of two even or two odd numbers. In all cases $\xi$ and $\eta$ are consequently integers.

The  method  mentioned  above  is  based  upon  the  theorem: 

Every  Lagrange  number  is  a  power  of  the  smallest  Lagrange  number  with  an 
integral  exponent. 

Proof. Let $x|y$ be the minimal solution of the Fermat equation and thus $z = \tfrac{1}{2}(x + yr)$ the smallest Lagrange number. First it follows from the auxiliary theorem that every power of $z$ is a Lagrange number.

Now let $Z = \tfrac{1}{2}(X + Yr)$ be a Lagrange number that is not a power of $z$. Then there must certainly exist two successive powers $\mathfrak{z} = z^n$ and $\mathfrak{z}' = z^{n+1}$ between which $Z$ is situated. From

$$z^n < Z < z^{n+1}$$

it follows on division with $z^n$ that

$$1 < Z/\mathfrak{z} < z.$$

Thus, the Lagrange number $\zeta = Z/\mathfrak{z}$ would be smaller than the smallest Lagrange number $z$, which is naturally absurd.

Consequently, the only Lagrange numbers are the powers

$$z, \ z^2, \ z^3, \ z^4, \ \ldots$$

And the simplest way of finding the 2nd, 3rd, ... solution of the Fermat equation is to find them as components of the Lagrange numbers $z^2, z^3, \ldots$.
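The product rule of the auxiliary theorem (upper signs) gives a simple recurrence for the successive solutions. The following sketch assumes that the pair passed in is already a solution of $x^2 - Dy^2 = 4$; the function names are illustrative.

```python
def lagrange_multiply(X, Y, x, y, D):
    """Components (xi, eta) of the product of (X + Y*r)/2 and (x + y*r)/2, with r*r = D."""
    xi = (X * x + D * Y * y) // 2      # exact division, as shown in the auxiliary theorem
    eta = (Y * x + X * y) // 2
    return xi, eta

def solutions(x1, y1, D, count):
    """Components of z, z^2, ..., z^count for the Lagrange number z = (x1 + y1*r)/2."""
    sols = [(x1, y1)]
    while len(sols) < count:
        X, Y = sols[-1]
        sols.append(lagrange_multiply(X, Y, x1, y1, D))
    return sols

for x, y in solutions(254, 24, 112, 3):
    print(x, y, x * x - 112 * y * y)
```

Python integers are of arbitrary size, so the rapidly growing components cause no overflow.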

The Fermat-Gauss Impossibility Theorem


Prove  that  the  sum  of  two  cubic  numbers  cannot  be  a  cubic  number. 


Thus,  what  must  be  proved  is  that  the  equation 


$$x^3 + y^3 = z^3$$


cannot  be  composed  of  nonevanescent  integers  x,  y,  z. 

The  theorem  that  we  have  to  prove  is  a  special  case  of  the  famous  Fermat 
impossibility  theorem,  which  was  expressed  by  Fermat  in  the  following  way  in 
the  arithmetic  of  Diophantus,  edited  by  Fermat’s  son,  and  published  in  1670: 

“It  is  impossible  to  divide  a  cube  into  two  cubes,  a  fourth  power  into  two 
fourth  powers,  and  in  general  any  power  except  the  square  into  two  powers  with 
the  same  exponents.” 

Fermat added: "I have discovered a truly wonderful proof of this, but the margin (of the notebook) is too narrow to hold it." Unfortunately, Fermat neglected to disclose this "wonderful proof."

Fermat’s  impossibility  theorem  became  very  famous  as  a  result  of  the  fact 
that  many  of  the  greatest  mathematicians  since  Fermat,  including  Euler, 
Legendre,  Gauss,  Dirichlet,  Kummer,  and  others  tried  unsuccessfully  to  obtain 
the  general  proof  of  this  theorem.  To  the  present  day  a  proof  of  the  impossibility 
of  the  equation 


$$x^n + y^n = z^n$$

is  known  only  for  special  values  of  the  exponent  n,  e.g.,  for  the  values  from  3  to 
100,  and  even  this  proof  involves  extraordinary  complications  and  difficulties. 

In  the  following  we  will  limit  ourselves  to  the  simplest  case,  the  case  n  =  3. 
The  impossibility  of  the  equation 


$$x^3 + y^3 = z^3$$

was demonstrated by Euler in his algebra, which appeared in 1770, and later by Gauss (Complete Works, vol. II). This problem shows, as it often happens in mathematics, that the proof of a more general theorem is easier to obtain than that of a special case. To prove the impossibility of

(1) $a^3 + b^3 = c^3$

for the common integers $a$, $b$, $c$ Euler had to resort to a relatively complicated method; Gauss, on the other hand, proved simply and clearly the impossibility of the more general equation

(2) $\alpha^3 + \beta^3 = \gamma^3$

for any numbers $\alpha$, $\beta$, $\gamma$ of the form $xJ + yO$, where $x$ and $y$ are any integers, and

$$J = \frac{1 + i\sqrt{3}}{2} \quad\text{and}\quad O = \frac{1 - i\sqrt{3}}{2}$$

are cube roots of the (negative) unit.

For convenience in notation we will call numbers of the form $xJ + yO$ (in which $x$ and $y$ are integers) G-numbers.

That the case treated by Euler is simply a special case of (2) is apparent from the fact that every integer $g$ is also a G-number: $g = gJ + gO$.

The  G-numbers  (which  are  the  integers  of  the  so-called  group  of  the  cubic 
unit  roots)  have  many  properties  in  common  with  common  integers.  Readers 
unfamiliar  with  these  properties  will  find  all  the  information  necessary  for  an 
understanding  of  the  Gauss  proof  in  the  supplement  provided  on  p.  100. 


Gauss’  Proof  of  the  Impossibility  of  the  Equation 

(2) $\alpha^3 + \beta^3 = \gamma^3$.


First, let Greek letters designate G-numbers and small Roman letters common integers.

We then replace $\alpha$, $\beta$, $\gamma$ with $\xi$, $\eta$, $-\zeta$, transforming (2) first into the symmetrical equation

(3) $\xi^3 + \eta^3 + \zeta^3 = 0$,

of which we assume that no two of the three "bases" $\xi$, $\eta$, $\zeta$ have a common divisor; we will then refer to this equation as a Gauss equation. [The assumption we have just made in no way limits the proof. If, for example, $\xi$ and $\eta$ possessed a common prime factor $\delta$, then, in accordance with (3), $\delta$ would also go into $\zeta^3$ and consequently into $\zeta$, so that division by $\delta^3$ would eliminate the divisor $\delta$ from (3).]

The  impossibility  of  (3)  is  obtained  from  the  two  following  theorems,  which 
we  will  derive  from  the  assumption  of  the  existence  of  (3). 

I. In every Gauss equation one and only one of the three bases (we will call it the special base) has the prime divisor $\pi = J - O$.

II. For every Gauss equation there is a second Gauss equation in which the special base contains the divisor $\pi$ fewer times than the special base of the first equation.

These  two  theorems,  however,  contradict  each  other.  By  continued 
application  of  II.  it  is  possible  to  obtain  a  Gauss  equation  that  no  longer  contains 
a  special  base,  which  contradicts  theorem  I. 

Proof of I. If none of the three bases $\xi$, $\eta$, $\zeta$ were divisible by $\pi$, then

$$\xi^3 \equiv e, \quad \eta^3 \equiv f, \quad \zeta^3 \equiv g \pmod 9 \quad\text{with}\quad e^2 = f^2 = g^2 = 1$$

and consequently, because of (3), $e + f + g \equiv 0 \pmod 9$, which is, however, impossible. Therefore a situation such as the following must exist:

$$\zeta \equiv 0 \pmod \pi, \quad \xi \not\equiv 0 \pmod \pi, \quad \eta \not\equiv 0 \pmod \pi.$$
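The congruence used in this proof, that the cube of a G-number not divisible by $\pi$ is congruent to $+1$ or $-1$ modulo 9, can be tested by machine on small cases. The sketch below rests on two conventions that are assumptions of the illustration, not of the text: a G-number $aJ + bO$ is stored as the pair `(a, b)`, and divisibility by $\pi$ is tested through the criterion that $a + b$ be divisible by 3 (which follows from $\xi/\pi = -\xi\pi/3$ and the product formula of the supplement). The function names are hypothetical.

```python
def g_mul(u, v):
    """Product of the G-numbers aJ + bO and a'J + b'O, stored as pairs (a, b)."""
    a, b = u
    a2, b2 = v
    return (a * b2 + b * a2 - b * b2, a * b2 + b * a2 - a * a2)

def divisible_by_pi(xi):
    """True when pi = J - O goes into xi (criterion: a + b divisible by 3)."""
    a, b = xi
    return (a + b) % 3 == 0

def cube_residue_mod_9(xi):
    """Return e in {1, -1} with xi^3 congruent to e modulo 9, or None if there is none."""
    p, q = g_mul(g_mul(xi, xi), xi)
    for e in (1, -1):
        # the common integer e is the G-number eJ + eO, because J + O = 1
        if (p - e) % 9 == 0 and (q - e) % 9 == 0:
            return e
    return None

sample = [(a, b) for a in range(-6, 7) for b in range(-6, 7)]
print(all(cube_residue_mod_9(xi) in (1, -1) for xi in sample if not divisible_by_pi(xi)))

# e + f + g can only be +-1 or +-3, never a multiple of 9
print(sorted({e + f + g for e in (1, -1) for f in (1, -1) for g in (1, -1)}))
```

A finite search of this kind is of course no proof; it merely makes the two facts on which the argument turns tangible.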

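Proof of II. It follows from $\zeta^3 \equiv 0 \pmod{\pi^3}$, according to (3), that $\xi^3 + \eta^3 \equiv 0 \pmod{\pi^3}$, and since $\xi^3 \equiv e \pmod 9$, $\eta^3 \equiv f \pmod 9$, $e + f \equiv 0 \pmod{\pi^3}$, then $e + f \equiv 0 \pmod 3$ must be true; from this it follows that $f = -e$. Now $\xi^3 + \eta^3 \equiv e + f \equiv 0 \pmod 9$, and consequently $\zeta^3 \equiv 0 \pmod{\pi^4}$ (since $9 = \pi^4$) and

$$\zeta \equiv 0 \pmod{\pi^2}.$$

From $\xi^3 + \eta^3 \equiv 0 \pmod{\pi^3}$ and the identity

$$\xi^3 + \eta^3 = \varphi\psi\chi,$$

where

$$\varphi = \xi J + \eta O, \quad \psi = \xi O + \eta J, \quad \chi = \xi + \eta,$$

it follows that at least one of the factors $\varphi$, $\psi$, $\chi$ is divisible by $\pi$. From this and from $\varphi - \psi = (\xi - \eta)\pi$, $\varphi + \psi = \chi$ it follows that each one of the factors $\varphi$, $\psi$, $\chi$ is divisible by $\pi$, so that

$$\varphi = \pi\varphi', \quad \psi = \pi\psi', \quad \chi = \pi\chi'.$$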

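Thus no pair of the numbers $\varphi'$, $\psi'$, $\chi'$ possesses a common divisor.

[If, for example, $\varphi'$ and $\psi'$ possessed a common divisor $\delta$, then $\delta$ would also go into $\varphi' - \psi' = \xi - \eta$ and into $\pi(\varphi' + \psi') = \xi + \eta$, and then also $2\xi$ and $2\eta$ would be divisible by $\delta$, so that $\delta$ would be equal to 2. Then we would either have $\xi = 2\lambda + \varepsilon$, $\eta = 2\mu + \varepsilon$, or $\xi = 2\lambda + \varepsilon$, $\eta = 2\mu - \varepsilon$, with $\varepsilon^3 = \pm 1$, and then $\varphi = 2\nu + \varepsilon$ or $\varphi = 2\nu + \varepsilon\pi$, which, however, is not divisible by $\delta = 2$.]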

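If we now set $\zeta/\pi = \omega$, then

$$\omega^3 = -\varphi'\psi'\chi' \quad\text{with}\quad \varphi' + \psi' = \chi'.$$

Since then no pair of $\varphi'$, $\psi'$, $-\chi'$ possesses a common divisor, these three magnitudes, down to the possible unit factors $\alpha$, $\beta$, $\gamma$, must be cubes of the numbers $\rho$, $\sigma$, $\tau$, no pair of which possesses a common divisor:

$$\varphi' = \alpha\rho^3, \quad \psi' = \beta\sigma^3, \quad -\chi' = \gamma\tau^3 \quad\text{with}\quad \alpha^6 = \beta^6 = \gamma^6 = 1,$$

so that

(4) $\omega^3 = \alpha\beta\gamma\rho^3\sigma^3\tau^3, \quad \alpha\rho^3 + \beta\sigma^3 + \gamma\tau^3 = 0$.

However, since the cube of $K = \omega/\rho\sigma\tau$ is the G-unit $\alpha\beta\gamma$, and since $K^3 \equiv E \pmod 9$, we have $\alpha\beta\gamma \equiv E \pmod 9$ also, and consequently

$$\alpha\beta\gamma = E \quad\text{with}\quad E^2 = 1.$$

From $\omega \equiv 0 \pmod \pi$ it follows, for example, that

$$\tau \equiv 0 \pmod \pi \quad\text{and}\quad \rho \not\equiv 0, \ \sigma \not\equiv 0 \pmod \pi.$$

Then, however, $\rho^3 \equiv e$ and $\sigma^3 \equiv f \pmod 9$ ($e^2 = f^2 = 1$), and consequently, according to (4), $e\alpha + f\beta \equiv 0 \pmod 3$, and from this $e\alpha + f\beta = 0$. Thus, we obtain

$$\beta = F\alpha, \quad F\alpha^2\gamma = E \quad (\text{with } F^2 = 1)$$

and from (4)

$$F\alpha^3\rho^3 + \alpha^3\sigma^3 + E\tau^3 = 0.$$

If we write here $\xi'$, $\eta'$, $\zeta'$ in place of $F\alpha\rho$, $\alpha\sigma$, $E\tau$, respectively, we finally obtain the Gauss equation

(3') $\xi'^3 + \eta'^3 + \zeta'^3 = 0$,

into the special base $\zeta'$ of which the factor $\pi$ goes fewer times than into the special base $\zeta$ of (3).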


Supplement.  Properties  of  G-numbers 

I. The magnitudes $J$ and $O$ satisfy the following equations:

$$J + O = 1, \quad JO = 1, \quad J^2 + O = 0,$$

$$O^2 + J = 0, \quad J^3 = -1, \quad O^3 = -1.$$

II. The sum, difference, and products of G-numbers are also G-numbers.

The product of the two numbers $aJ + bO$ and $a'J + b'O$ is, for example (according to I.), $pJ + qO$ with

$$p = ab' + ba' - bb' \quad\text{and}\quad q = ab' + ba' - aa'.$$

III. Norm. The norm of a complex number $z = x + iy$ is commonly understood to be the product

$$z_0 = N(z) = z\bar{z} = (x + iy)(x - iy) = x^2 + y^2$$

of the two mutually conjugate numbers $z$ and $\bar{z} = x - iy$.

The norm of the G-number $aJ + bO$ accordingly has the value $a^2 + b^2 - ab$. It is a positive integer which disappears only when $a$ and $b$ are both zero. The smallest conceivable norms of G-numbers are 1, 2, 3.

From

$$a^2 + b^2 - ab = 1$$

we obtain one of the six following cases:

    a =    1    0   -1    0    1   -1
    b =    0    1    0   -1    1   -1

There are thus six G-numbers:

$$J, \ -J, \ O, \ -O, \ 1, \ -1$$

with the norm 1.

The equation

$$a^2 + b^2 - ab = 2$$

has no solution that is an integer. There is consequently no G-number whose norm is 2.

The equation

$$a^2 + b^2 - ab = 3$$

finally has six integral solutions

$$a = 1, \ b = -1; \quad a = -1, \ b = 1; \quad a = 1, \ b = 2;$$

$$a = -1, \ b = -2; \quad a = 2, \ b = 1; \quad a = -2, \ b = -1.$$

Accordingly, there are six G-numbers with the norm 3, the numbers $\pi = J - O = i\sqrt{3}$, $\pi J$, $\pi O$, and their conjugates $\bar{\pi} = -\pi$, $-\pi O$, $-\pi J$.

The  norm  of  the  product  of  two  numbers  is  equal  to  the  product  of  the  norms 
of  these  numbers. 

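Proof. $N(\alpha\beta) = \alpha\beta \cdot \overline{\alpha\beta} = \alpha\beta \cdot \bar{\alpha}\bar{\beta} = \alpha\bar{\alpha} \cdot \beta\bar{\beta} = N(\alpha) \cdot N(\beta)$.

IV. Units. A G-number $\varepsilon$ is called a unit, or more accurately a G-unit, when its reciprocal value $\eta$ is also a G-number. From $\varepsilon\eta = 1$ it follows from norm formation that $\varepsilon_0\eta_0 = 1$, i.e., $\varepsilon_0 = 1$. According to III., there are consequently six G-units:

$$J, \ -J, \ O, \ -O, \ 1, \ -1.$$

These six units are the integral powers of $J$ or $O$, e.g., $J$, $J^2$, $J^3$, $J^4$, $J^5$, and $J^6$.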
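The product formula of II. and the norm of III. are easily put to work. In the following sketch a G-number $aJ + bO$ is represented by the pair `(a, b)`; this representation and the function names are conventions of the illustration only. The sketch computes the six powers of $J$ and searches a small box of coefficients for G-numbers of norm 1, 2 and 3 (the box suffices, since a norm of at most 3 forces $|a| \leq 2$ and $|b| \leq 2$).

```python
def g_mul(u, v):
    """Product of aJ + bO and a'J + b'O: p = ab' + ba' - bb', q = ab' + ba' - aa'."""
    a, b = u
    a2, b2 = v
    return (a * b2 + b * a2 - b * b2, a * b2 + b * a2 - a * a2)

def g_norm(u):
    """Norm a^2 + b^2 - ab of the G-number aJ + bO."""
    a, b = u
    return a * a + b * b - a * b

J = (1, 0)
powers = []                  # J, J^2, ..., J^6
current = J
for _ in range(6):
    powers.append(current)
    current = g_mul(current, J)
print(powers)

box = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
for n in (1, 2, 3):
    print(n, [z for z in box if g_norm(z) == n])
```

In this representation the common integer 1 is the pair `(1, 1)`, because $J + O = 1$.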

V. Associated numbers. The six numbers that are obtained when a G-number $\zeta$ is multiplied by the six G-units are called the associated numbers of $\zeta$.

The six associated numbers of $\pi = J - O$ are, for example,

$$\pi J = -1 - O, \quad \pi J^2 = -1 - J, \quad \pi J^3 = -\pi,$$

$$\pi J^4 = 1 + O, \quad \pi J^5 = 1 + J, \quad \pi J^6 = \pi.$$

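VI. Division. The quotient $q = \alpha/\beta$ of two G-numbers $\alpha$ and $\beta$ is not necessarily a G-number. If it is a G-number, however, $\beta$ is called a divisor (G-divisor) of $\alpha$ or one says that $\beta$ goes into $\alpha$.

In order to divide any G-number $\alpha$ by any other $\beta$ we write

$$\frac{\alpha}{\beta} = \frac{\alpha\bar{\beta}}{\beta\bar{\beta}} = \frac{\alpha\bar{\beta}}{\beta_0} = \frac{hJ + kO}{\beta_0} = \frac{h}{\beta_0}J + \frac{k}{\beta_0}O.$$

Here we divide each rational fraction $h/\beta_0$ and $k/\beta_0$ into the integral components $m$ and $n$, respectively, and the rational components $\mathfrak{r}$ and $\mathfrak{s}$, respectively, the absolute value of which never exceeds $\tfrac{1}{2}$ [Example: $3.8 = 4 - 0.2$]; we set $mJ + nO = \kappa$, $\mathfrak{r}J + \mathfrak{s}O = \mathfrak{R}$, and obtain

$$\frac{\alpha}{\beta} = \kappa + \mathfrak{R} \quad\text{or}\quad \alpha = \kappa\beta + \mathfrak{R}\beta.$$

From $\mathfrak{R}\beta = \alpha - \kappa\beta$ it follows that $\mathfrak{R}\beta$ is a G-number $\gamma$, and we have

$$\alpha = \kappa\beta + \gamma.$$

Here $\gamma_0 = \mathfrak{R}_0\beta_0 = (\mathfrak{r}^2 + \mathfrak{s}^2 - \mathfrak{r}\mathfrak{s})\beta_0$. Since, however, $|\mathfrak{r}| \leq \tfrac{1}{2}$ and $|\mathfrak{s}| \leq \tfrac{1}{2}$, then $\mathfrak{R}_0$ must certainly be $\leq \tfrac{3}{4}$, i.e., $\gamma_0 \leq \tfrac{3}{4}\beta_0$.

Conclusion. The division of a G-number $\alpha$ by another G-number $\beta$ results in a "quotient" $\kappa$ and a "residue" $\gamma$ such that

$$\alpha = \kappa\beta + \gamma,$$

with the residue norm being at most equal to $\tfrac{3}{4}$ of the divisor norm.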
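The division just described can be carried out entirely in integers. The following is a minimal sketch under the same convention as before (an assumption of the illustration): the pair `(a, b)` stands for $aJ + bO$, so that the conjugate of `(a, b)` is `(b, a)`. The function names are hypothetical, and rounding to the nearest integer is done with floor division.

```python
def g_mul(u, v):
    """Product of the G-numbers aJ + bO and a'J + b'O, stored as pairs (a, b)."""
    a, b = u
    a2, b2 = v
    return (a * b2 + b * a2 - b * b2, a * b2 + b * a2 - a * a2)

def g_norm(u):
    a, b = u
    return a * a + b * b - a * b

def g_divmod(alpha, beta):
    """Return (kappa, gamma) with alpha = kappa*beta + gamma and norm(gamma) <= 3/4 norm(beta)."""
    n = g_norm(beta)                                # beta_0, must not be zero
    h, k = g_mul(alpha, (beta[1], beta[0]))         # alpha times the conjugate of beta
    m = (2 * h + n) // (2 * n)                      # nearest integer to h / beta_0
    l = (2 * k + n) // (2 * n)                      # nearest integer to k / beta_0
    kappa = (m, l)
    prod = g_mul(kappa, beta)
    gamma = (alpha[0] - prod[0], alpha[1] - prod[1])
    return kappa, gamma

kappa, gamma = g_divmod((7, 2), (2, 1))
print(kappa, gamma, g_norm(gamma), g_norm((2, 1)))
```

Since the residue is obtained as $\alpha - \kappa\beta$, the relation $\alpha = \kappa\beta + \gamma$ holds by construction; only the bound on the residue norm depends on the rounding.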

VII. The algorithm of the greatest common divisor. We start with the division $\alpha/\beta$ and the related equation

(1) $\alpha = \kappa\beta + \gamma$ with $\gamma_0 \le \tfrac34\beta_0$,

and determine, as in VI., the quotient $\lambda$ and the residue $\delta$ of the division $\beta/\gamma$; in this way we obtain the corresponding equation

(2) $\beta = \lambda\gamma + \delta$ with $\delta_0 \le \tfrac34\gamma_0$.

Then in a similar manner we obtain

(3) $\gamma = \mu\delta + \varepsilon$ with $\varepsilon_0 \le \tfrac34\delta_0$,

etc. Since the residue norms are non-negative integers that become progressively smaller, we must finally obtain a residue of zero. To avoid unnecessary writing we will assume that the division $\delta/\varepsilon$ following (3) leaves no residue, so that

(4) $\delta = \nu\varepsilon$.

Now it follows from (4) that every divisor $\tau$ of $\varepsilon$ also goes into $\delta$ without residue, and, therefore, it follows from (3) that $\tau$ also goes into $\gamma$ without residue; consequently, it follows from (2) that $\tau$ goes into $\beta$ without residue, and, finally, from (1) it follows that $\tau$ goes into $\alpha$ without residue.

In reverse order: it follows from (1) that every common divisor $\tau$ of $\alpha$ and $\beta$ is also a divisor of $\gamma$; then, from (2), that $\tau$ also goes into $\delta$ without residue, and, finally, from (3), that $\tau$ is also a divisor of $\varepsilon$.

Every common divisor of $\alpha$ and $\beta$ consequently goes into $\varepsilon$ without residue, and every divisor of $\varepsilon$ goes into $\alpha$ and $\beta$ without residue.

$\varepsilon$ is accordingly (in terms of its absolute value) the highest common divisor of $\alpha$ and $\beta$.

If, in particular, $\varepsilon$ is a G-unit, the numbers $\alpha$ and $\beta$ are said to have no common divisor or to be prime with respect to each other.

The chain of equations (1), (2), (3), ... is nothing other than the extension to G-numbers of the well-known algorithm for determination of the highest common divisor of ordinary integers.
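The division of VI. and the chain of equations (1), (2), (3), ... can be tried out numerically. The following is a minimal sketch, not the author's own procedure. It stores a G-number $aJ + bO$ as the integer pair `(a, b)` and assumes the convention $J + O = 1$, $JO = 1$ (hence $J^2 = -O$, $O^2 = -J$), which agrees with the product worked out in section X. The function names are illustrative only.

```python
# G-numbers a*J + b*O are stored as integer pairs (a, b).
# Assumed convention: J + O = 1 and J*O = 1, hence J*J = -O and O*O = -J.

def norm(x):
    a, b = x
    return a * a + b * b - a * b

def mul(x, y):
    a, b = x
    c, d = y
    return (a * d + b * c - b * d, a * d + b * c - a * c)

def divide(alpha, beta):
    """Return (kappa, gamma) with alpha = kappa*beta + gamma and
    norm(gamma) <= 3/4 * norm(beta); beta must not be zero."""
    a, b = alpha
    c, d = beta
    n = norm(beta)
    # alpha * conj(beta), where the conjugate of c*J + d*O is d*J + c*O;
    # dividing both components by n gives alpha / beta.
    p, q = mul(alpha, (d, c))
    # Round each component to a nearest integer (error at most 1/2).
    kappa = ((2 * p + n) // (2 * n), (2 * q + n) // (2 * n))
    kb = mul(kappa, beta)
    return kappa, (a - kb[0], b - kb[1])

def gcd(alpha, beta):
    """Last non-zero residue of the chain (1), (2), (3), ..."""
    while beta != (0, 0):
        _, gamma = divide(alpha, beta)
        alpha, beta = beta, gamma
    return alpha

print(divide((7, 2), (3, -1)))        # ((1, 2), (0, 0)): exact division
print(norm(gcd((1, -1), (2, 2))))     # 1: J - O and 2 have only units in common
```

The result of `gcd` is determined only up to a unit factor, which is why the example compares norms rather than the pairs themselves.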

VIII. Unequivocal division of G-numbers into prime factors. Just as with integers, the common theorems governing divisibility, indivisibility and unequivocal division into prime factors are derived from the divisional algorithm:

1. If $\alpha$ and $\beta$ possess no common divisor and $\alpha\mu$ is divisible by $\beta$, then $\mu$ is divisible by $\beta$.

2. If two G-numbers possess no common divisor with one and the same third G-number, their product also possesses no common divisor with this third G-number.

3. Every G-number can be divided into a product of prime factors (i.e., G-primes) in only one way. [Divisions such as $\alpha\beta\gamma$ and $\alpha J\cdot\beta\cdot\gamma O$, in which one contains the associated numbers of the other rather than certain factors of it, are not considered different from each other.]

A G-prime is a G-number that possesses no divisor aside from its six associated numbers and the six units.

The numbers $\pi = J - O$ and 2 are, for example, primes.

If, for example, we assume that $\pi$ is divisible: $\pi = \lambda\mu$, then $\pi_0 = \lambda_0\mu_0$ or $3 = \lambda_0\mu_0$. From this it follows that $\lambda_0 = 3$, $\mu_0 = 1$. $\mu$ is therefore a unit and the equation $\pi = \lambda\mu$ does not represent a division.

From $2 = \lambda\mu$ it follows that $2_0 = \lambda_0\mu_0$ or $4 = \lambda_0\mu_0$. The case of $\lambda_0 = 2$, $\mu_0 = 2$ is eliminated because, according to III., there is no G-number having a norm equal to 2.

Thus, we are left with $\lambda_0 = 4$, $\mu_0 = 1$. Once again $\mu$ is a unit and the equation $2 = \lambda\mu$ does not represent a division.

IX. Congruence. As in the theory of natural numbers, we say here also that two G-numbers $\alpha$ and $\beta$ are congruent modulo $\mu$, written $\alpha \equiv \beta \bmod \mu$, when their difference $\alpha - \beta$ is divisible by the G-number $\mu$.

X. G-numbers modulo $\pi$. We will consider one more G-number $\kappa = aJ + bO$ in relation to the modulus $\pi = J - O$.

If $\kappa$ is divisible by $\pi$:

$$aJ + bO = (mJ + nO)(J - O) = (2n - m)J + (n - 2m)O,$$

then $a = 2n - m$, $b = n - 2m$, thus

$$a + b = 3g \quad\text{with}\quad g = n - m.$$

Conversely, if $a + b = 3g$, $m$ and $n$ are determined from $n - m = g$ and $2n - m = a$, giving $\kappa = (mJ + nO)(J - O)$.

The G-number $\kappa = aJ + bO$ is thus divisible by $\pi$ if and only if $a + b$ is divisible by 3.
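The criterion can be checked by solving the two linear equations $a = 2n - m$, $b = n - 2m$ for $m$ and $n$. This is a small illustrative sketch; the function names are hypothetical and not part of the text.

```python
def divisible_by_pi(a, b):
    """True if a*J + b*O = (m*J + n*O)(J - O) has an integer solution m, n."""
    # Solving a = 2n - m, b = n - 2m gives 3m = a - 2b and 3n = 2a - b.
    return (a - 2 * b) % 3 == 0 and (2 * a - b) % 3 == 0

def quotient_by_pi(a, b):
    """Return (m, n) of the quotient; only meaningful when divisible_by_pi(a, b)."""
    return ((a - 2 * b) // 3, (2 * a - b) // 3)

# The criterion of the text: divisible exactly when a + b is a multiple of 3.
assert all(divisible_by_pi(a, b) == ((a + b) % 3 == 0)
           for a in range(-9, 10) for b in range(-9, 10))

print(quotient_by_pi(4, -1))   # (2, 3), since 2n - m = 4 and n - 2m = -1
```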

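If $\kappa$ is not divisible by $\pi$, then one of the three following formula pairs is valid:

$$a = 3h,\ b = 3k + \varepsilon; \qquad a = 3h + \varepsilon,\ b = 3k; \qquad a = 3h + \varepsilon,\ b = 3k + \varepsilon,$$

with $\varepsilon^2 = 1$, and thus, if $hJ + kO$ is set equal to $\lambda$,

$$\kappa = 3\lambda + \varepsilon O \quad\text{or}\quad \kappa = 3\lambda + \varepsilon J \quad\text{or}\quad \kappa = 3\lambda + \varepsilon,$$

so that in every case $\kappa$ has the form

$$\kappa = 3\lambda + \varepsilon,$$

where $\varepsilon$ is a G-unit.

Let us now consider the cube of $\kappa$. It becomes

$$\kappa^3 = 9(3\lambda^3 + 3\lambda^2\varepsilon + \lambda\varepsilon^2) + \varepsilon^3,$$

and, because $\varepsilon^3 = \pm 1$, it has the form

$$\kappa^3 = 9\Lambda \pm 1$$

with a G-number $\Lambda$.

If $\kappa$ is not divisible by $\pi$ we then have the congruences $\kappa \equiv \varepsilon \bmod 3$, $\kappa^3 \equiv \pm 1 \bmod 9$.

The Quadratic Reciprocity Law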


(The Euler-Legendre-Gauss theorem.) The reciprocal Legendre symbols of the odd prime numbers p and q are governed by the formula

$$\left(\frac{p}{q}\right)\cdot\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$


This  law,  the  so-called  quadratic  reciprocity  law,  was  formulated  but  not 
proved  by  Euler  (Opuscula  analytica,  Petersburg,  1783).  In  1785  Legendre 
discovered the same law (Histoire de l'Académie des Sciences) independently of
Euler  and  proved  it  partially. 

The  first  complete  proof  was  presented  by  Karl  Friedrich  Gauss  (1777-1855) 
in  his  famous  Disquisitiones  arithmeticae  (published  in  1801),  a  book  that  laid 
the  foundations  of  contemporary  number  theory;  this  work,  its  five  hundred 
quarto  pages  swarming  with  profound  ideas,  was  written  when  Gauss  was  20 
years  old.  “It  is  really  astonishing,”  says  Kronecker,  “to  think  that  a  single  man 
of  such  young  years  was  able  to  bring  to  light  such  a  wealth  of  results,  and  above 
all  to  present  such  a  profound  and  well  organized  treatment  of  an  entirely  new 
discipline.” 

Later  Gauss  discovered  seven  other  proofs  of  the  reciprocity  theorem.  (The 
Gauss  proofs  may  be  found  in  vol.  14  of  Ostwald’s  Klassiker  der  exakten 
Wissenschaften .) 

The  quadratic  reciprocity  law  is  one  of  the  most  important  theorems  of 
number theory. Gauss called it the “Theorema fundamentale.” The American
mathematician  Dickson  says  in  his  Theory  of  Numbers:  “The  quadratic 
reciprocity  law  is  doubtless  the  most  important  tool  in  the  theory  of  numbers  and 
occupies  the  central  position  in  its  history”. 

The importance of this law led other mathematicians like Jacobi, Cauchy, Liouville, Kronecker, Schering, and Frobenius to investigate it after Gauss and offer proofs of it. In his Niedere Zahlentheorie, P. Bachmann cites no fewer than 52 proofs and reports on the most important.

Probably the simplest of all the proofs is the following arithmetic-geometric
proof,  which  arises  from  the  combination  of  the  so-called  lemma  of  Gauss 
(Gauss’  Werke,  vol.  II,  p.  51)  and  a  geometric  idea  of  Cayley  (Arthur  Cayley 
[1821-1895],  Collected  Mathematical  Papers,  vol.  II). 

Before taking up the proof itself we will give the derivation of Gauss’ lemma. Let p be an odd prime number and D an integer that is not divisible by p. If x represents one of the numbers $1, 2, 3, \ldots, P = (p - 1)/2$, $R_x$ the common residue of the division $Dx/p$, $g_x$ the corresponding integral quotient, then

(1) $Dx = R_x + g_x p.$

Accordingly as $R_x$ is smaller or greater than $\tfrac12 p$, we set $R_x = \rho_x$ or $R_x = \rho_x + p$, where in the second case $\rho_x$ represents the negative minimum residue of the division $Dx/p$, and we obtain

(1a) $Dx = \rho_x + g_x p$ or (1b) $Dx = \rho_x + p + g_x p.$

If n is then the number of negative minimum residues occurring in the P divisions $Dx/p$ (for $x = 1, 2, 3, \ldots, P$), we have n equations of the form (1b) and $m = P - n$ equations of the form (1a).

We convert these equations into congruences mod p and obtain the P congruences

(2) $Dx \equiv \rho_x \bmod p.$

Now the P residues $\rho_x$ agree, except with respect to sign and sequence, with the P numbers 1 to P.

[If, for example, $\rho_r$ were equal to $\rho_s$ or $\rho_r = -\rho_s$ for two different values r and s of x, then $Dr \equiv \rho_r$ and $Ds \equiv \rho_s$ would yield by subtraction or addition, respectively, $D(r \mp s) \equiv 0 \bmod p$. This congruence is, however, impossible, because neither D nor $r \mp s$ is divisible by p.]

Multiplication of the P congruences (2) results in

$$D^P\,P! \equiv (-1)^n\,P! \bmod p,$$

and from this we obtain

$$D^P \equiv (-1)^n \bmod p.$$

However, since, according to Euler’s theorem (No. 19),

$$\left(\frac{D}{p}\right) \equiv D^P \bmod p,$$

we obtain

$$\left(\frac{D}{p}\right) \equiv (-1)^n \bmod p,$$

whence, since both sides of this congruence have the absolute value 1,

(3) $\left(\dfrac{D}{p}\right) = (-1)^n.$

This formula, in which n represents the number of negative minimum residues resulting from the P divisions $Dx/p$ ($x = 1, 2, 3, \ldots, P$), is Gauss’ lemma.
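Gauss’ lemma lends itself to a direct numerical check against Euler’s criterion. The sketch below is illustrative only: the function names are not from the text, and p is assumed to be an odd prime that does not divide D.

```python
def gauss_lemma_symbol(D, p):
    """Legendre symbol (D/p) from Gauss' lemma: (-1)^n, where n counts the
    residues of D*x mod p (x = 1..P) that exceed p/2."""
    P = (p - 1) // 2
    n = sum(1 for x in range(1, P + 1) if (D * x) % p > P)
    return (-1) ** n

def euler_symbol(D, p):
    """Legendre symbol (D/p) from Euler's criterion D^P mod p."""
    r = pow(D, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

print(gauss_lemma_symbol(2, 7), euler_symbol(2, 7))   # 1 1   (3*3 = 9 = 2 mod 7)
print(gauss_lemma_symbol(3, 7), euler_symbol(3, 7))   # -1 -1
```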

Now let D be some odd prime number q that differs from p. We convert the P equations (1a) and (1b) into congruences to the modulus 2, leave out all the excess multiples of 2, e.g., $(q - 1)x$, and obtain

$$x \equiv \rho_x + g_x \bmod 2 \quad\text{and}\quad x \equiv 1 + \rho_x + g_x \bmod 2.$$

Addition of these P congruences yields

$$\sum x \equiv n + \sum \rho_x + \sum g_x \bmod 2.$$

However, since the absolute values of $\rho_x$ are in agreement with the numbers 1 through P and each summand can be replaced by its opposite value in a congruence mod 2, we will write $\sum x$ in the obtained congruence instead of $\sum \rho_x$ and $-n$ instead of n, thereby obtaining

$$\sum x + n \equiv \sum x + \sum g_x \bmod 2$$

or

(4) $n \equiv \sum g_x \bmod 2.$

In accordance with (4) we can now write (3) as

$$\left(\frac{q}{p}\right) = (-1)^{\sum g_x}.$$

Now $g_x$ is the greatest integer contained in the quotient $qx/p$. If we designate this as $[qx/p]$, we obtain at last

(I) $\left(\dfrac{q}{p}\right) = (-1)^{\sum [qx/p]},$

where x passes through all the integers from 1 to $P = (p - 1)/2$.

Accordingly,

(II) $\left(\dfrac{p}{q}\right) = (-1)^{\sum [py/q]},$

where y passes through all the integers from 1 to $Q = (q - 1)/2$.

Multiplication of (I) and (II) gives us

(III) $\left(\dfrac{p}{q}\right)\cdot\left(\dfrac{q}{p}\right) = (-1)^{\sum [qx/p] + \sum [py/q]}.$


The exponent of the right-hand side is, however, easily found.

On a system of rectangular coordinates xy we draw the rectangle with the four angles

$$0|0, \qquad \tfrac{p}{2}\Big|0, \qquad \tfrac{p}{2}\Big|\tfrac{q}{2}, \qquad 0\Big|\tfrac{q}{2},$$

and bisect it with a diagonal d from the origin, possessing the equation $y = (q/p)x$; we then mark off all the lattice points* within the rectangle. (Cf. the figure, in which p = 19, q = 11.)

To begin with, it is clear that no marked lattice point $x|y$ lies on d, since here x would necessarily be $< \tfrac12 p$ and y $< \tfrac12 q$, which contradicts the condition $y/x = q/p$ (p and q being different primes, this condition would require x to be a multiple of p).

For an integral abscissa x the corresponding ordinate of d is $y = qx/p$ and the number of marked lattice points lying on this ordinate is $[qx/p]$. Consequently, the number of the marked lattice points lying in the lower half of the rectangle is $\sum [qx/p]$, where x passes through all the integers from 1 to $P = (p - 1)/2$.

Similarly, the number of all the marked lattice points lying in the upper half of our rectangle is $\sum [py/q]$, where y passes through all the integers from 1 to $Q = (q - 1)/2$.
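The two lattice-point sums are easy to evaluate for concrete primes. The sketch below (with an illustrative function name) computes both sums for the values of the figure, together with the number $\frac{p-1}{2}\cdot\frac{q-1}{2}$ of lattice points lying strictly inside the rectangle.

```python
def lattice_counts(p, q):
    """Return (lower, upper, total) for distinct odd primes p and q:
    the points below the diagonal, the points above it, and the number
    of all lattice points strictly inside the rectangle."""
    P, Q = (p - 1) // 2, (q - 1) // 2
    lower = sum(q * x // p for x in range(1, P + 1))   # sum of [qx/p]
    upper = sum(p * y // q for y in range(1, Q + 1))   # sum of [py/q]
    return lower, upper, P * Q

print(lattice_counts(19, 11))   # (22, 23, 45)
```

For p = 19, q = 11 the two halves contain 22 and 23 points, which together account for all 9 · 5 = 45 interior lattice points.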

The exponent appearing in (III) is then the number of all the marked lattice points in our rectangle. This is a total of $\frac{p-1}{2}\cdot\frac{q-1}{2}$ elements. Consequently,

$$\left(\frac{p}{q}\right)\cdot\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

Q.E.D.

Gauss’ Fundamental Theorem of Algebra


Every equation of the nth degree

$$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \ldots + C_n = 0$$

has n roots.

Expressed more precisely, this theorem reads:
The polynomial

$$f(z) = z^n + C_1 z^{n-1} + C_2 z^{n-2} + \ldots + C_n$$

can always be divided into n linear factors of the form $z - \alpha_\nu$.

This famous theorem, the fundamental theorem of algebra, was first stated by d’Alembert in 1746, but only partially proved. The first rigorous proof was given in 1799 by Gauss, then twenty-one years old, in his doctoral dissertation Demonstratio nova theorematis omnem functionem algebraicam rationalem integram unius variabilis in factores reales primi vel secundi gradus resolvi posse (Helmstedt, 1799). Subsequently, Gauss gave three other proofs of this theorem. All four are to be found in the third volume of his Works, as well as in vol. 14 of Ostwald’s Klassiker der exakten Wissenschaften. Other authors after Gauss, including Argand, Cauchy, Ullherr, Weierstrass, and Kronecker also gave proofs of the fundamental theorem. The proof followed here (as modified by Cauchy) is Argand’s (Annales de Gergonne, 1815), which is distinguished by its brevity and simplicity.

This proof (like most of the other proofs) falls into two steps. The first (and more difficult) step merely demonstrates that an equation of the nth degree will always contain at least one root; the second step shows that it has n roots and no more.


First  Step 


We set

$$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \ldots + C_n = f(z) = w$$

and consider the different values that are assumed by the absolute magnitude $|w|$ when z is moved in the Gauss plane (the plane of complex numbers). Let the smallest of these values be $\mu$ and let it be attained, for example, at the site $z_0$, so that $|f(z_0)| = |w_0| = \mu$.

There are two possible cases:

1. The minimum $\mu$ is greater than zero.

2. The minimum $\mu$ is equal to zero.

We will begin by considering the first case. In the immediate vicinity of the point $z_0$, say, in the area defined by a small circle K of radius R with a center at $z_0$, $|w|$ is everywhere $\ge \mu$, since $\mu$ represents the smallest value of $|w|$; at $z_0$ itself $|w| = |w_0| = \mu$.

For any z in K, $z = z_0 + \zeta$ where $\zeta = \rho(\cos\vartheta + i\sin\vartheta)$ and $\rho$ is the absolute magnitude of $\zeta$, i.e., the line segment $z_0z$, and $\vartheta$ the inclination of this segment toward the axis of the positive real numbers. We calculate

$$w = f(z) = f(z_0 + \zeta) = (z_0 + \zeta)^n + C_1(z_0 + \zeta)^{n-1} + \ldots + C_n,$$

eliminating the parentheses and arranging according to increasing powers of $\zeta$. In this way we obtain

$$w = f(z) = z_0^n + C_1 z_0^{n-1} + C_2 z_0^{n-2} + \ldots + C_n + c_1\zeta + c_2\zeta^2 + \ldots + c_n\zeta^n,$$

i.e.,

$$w = f(z_0) + c_1\zeta + c_2\zeta^2 + \ldots + c_n\zeta^n.$$

Since several coefficients $c_\nu$ may be equal to zero, we call the first of the nonvanishing coefficients c, the second c', and so forth, so that

$$w = w_0 + c\,\zeta^{\nu} + c'\zeta^{\nu'} + c''\zeta^{\nu''} + \ldots$$

with $\nu < \nu' < \nu'' \ldots$.

Division by $w_0$ and isolation of $\zeta^\nu$ yields

$$\frac{w}{w_0} = 1 + q\,\zeta^{\nu}\cdot(1 + Q),$$

where $q = c/w_0$ and Q represents a sum of different powers of $\zeta$ with positive exponents and known coefficients.

We consider the product $q\zeta^\nu\cdot(1 + Q)$. We write the first factor trigonometrically, abbreviating $\cos\varphi + i\sin\varphi$ to $1_\varphi$ and, from $q = h(\cos\lambda + i\sin\lambda) = h\cdot 1_\lambda$ and $\zeta = \rho\cdot 1_\vartheta$, we obtain $q\zeta^\nu = h\,1_\lambda\cdot\rho^\nu\,1_{\nu\vartheta} = h\rho^\nu\,1_{\lambda + \nu\vartheta}$. From now on we confine ourselves to z-values of K for which $\lambda + \nu\vartheta = \pi$, which consequently lie on the radius $z_0H$ which forms the angle $\vartheta = (\pi - \lambda)/\nu$ with the real axis. For all these z’s the number $1_{\lambda + \nu\vartheta} = 1_\pi$ has the value $-1$, and our product assumes the form $-h\rho^\nu(1 + Q)$.

If we choose a sufficiently small radius R, the second factor $1 + Q$ can be brought as close to unity as we desire, since $\rho = |\zeta| < R$. But this means that the product lies as close as desired to the value $-h\rho^\nu$, i.e., the fraction

$$\frac{w}{w_0} = 1 - h\rho^\nu(1 + Q)$$

lies as close as we desire to the point $1 - h\rho^\nu$ of the Gauss plane, which shows that for all z’s between $z_0$ and H the absolute magnitude $|w/w_0| < 1$. In other words, for these z, $|w| < \mu$, while for all z’s in the vicinity of $z_0$, $|w|$ should be $\ge \mu$. This is a contradiction, and consequently the first of the two possible cases given above ($\mu > 0$) is eliminated. This leaves only the second case: $w_0$ is equal to zero or

$$f(z_0) = 0.$$

Therefore: Every equation, regardless of its degree, has at least one root.
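The key move of the first step, finding a direction along which $|w|$ decreases whenever $w_0 \ne 0$, can be illustrated numerically. The following is a sketch under stated assumptions, not part of the proof: polynomials are given by their coefficient lists (highest power first), the helper names are hypothetical, the tolerance `1e-12` for "nonvanishing" is an arbitrary choice, and the step length `rho` must be taken small enough for the inequality to hold.

```python
import cmath

def horner(coeffs, z):
    """Value of the polynomial (coefficients highest power first) at z."""
    w = 0
    for c in coeffs:
        w = w * z + c
    return w

def taylor_coeffs(coeffs, z0):
    """Return [w0, c1, ..., cn] with f(z0 + zeta) = w0 + c1*zeta + ... + cn*zeta**n,
    obtained by repeated synthetic division by (z - z0)."""
    a = list(coeffs)
    out = []
    while a:
        for i in range(1, len(a)):
            a[i] += a[i - 1] * z0
        out.append(a.pop())          # the remainder is the next coefficient
    return out

def argand_step(coeffs, z0, rho):
    """Move from z0 (where f(z0) != 0) a distance rho along the radius z0-H."""
    c = taylor_coeffs(coeffs, z0)
    w0 = c[0]
    nu = next(k for k in range(1, len(c)) if abs(c[k]) > 1e-12)
    lam = cmath.phase(c[nu] / w0)            # q = h * (cos lam + i sin lam)
    theta = (cmath.pi - lam) / nu            # lam + nu*theta = pi
    return z0 + cmath.rect(rho, theta)

f = [1, 0, 1]                                # z^2 + 1, with |f(0)| = 1
z1 = argand_step(f, 0, 0.1)
print(abs(horner(f, 0)), abs(horner(f, z1)))  # the second value is smaller
```

Here the step leads from $z_0 = 0$ along the imaginary axis, toward the root $i$.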

Second  Step 


We begin with the demonstration of the auxiliary theorem: If an algebraic equation $f(z) = 0$ has the root $\alpha$, then the left side of the equation can be divided by $z - \alpha$ without a remainder.

If we divide the polynomial $f(z)$ by $z - \alpha$ until the remainder R no longer contains z, we obtain

$$\frac{f(z)}{z - \alpha} = f_1(z) + \frac{R}{z - \alpha},$$

where R is a constant and $f_1(z)$ has the form

$$z^{n-1} + C'_1 z^{n-2} + C'_2 z^{n-3} + \ldots + C'_{n-1}.$$

Multiplication by $z - \alpha$ gives

$$f(z) = (z - \alpha)f_1(z) + R.$$

If in this equation, which is valid for every z, we set $z = \alpha$, we obtain

$$R = f(\alpha) = 0$$

and thus for every z

$$f(z) = (z - \alpha)f_1(z). \quad\text{Q.E.D.}$$
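The division by $z - \alpha$ used in the auxiliary theorem is ordinary synthetic division. A brief sketch follows (the function name is illustrative): it returns the quotient $f_1$ and the constant remainder R, and R equals $f(\alpha)$, so it vanishes exactly when $\alpha$ is a root.

```python
def divide_by_linear(coeffs, alpha):
    """Divide f (coefficients highest power first) by z - alpha.
    Returns (f1, R) with f(z) = (z - alpha) * f1(z) + R and R = f(alpha)."""
    b = []
    carry = 0
    for c in coeffs:
        carry = carry * alpha + c
        b.append(carry)
    R = b.pop()
    return b, R

f = [1, -6, 11, -6]                  # z^3 - 6z^2 + 11z - 6 = (z-1)(z-2)(z-3)
f1, R = divide_by_linear(f, 1)
print(f1, R)                         # [1, -5, 6] 0
f2, R2 = divide_by_linear(f1, 2)
print(f2, R2)                        # [1, -3] 0
print(divide_by_linear(f, 0)[1])     # -6, i.e. f(0); 0 is not a root
```

Repeating the division with one root after another peels off the linear factors one at a time.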

If we combine this auxiliary theorem with the theorem proved in the first step, which demonstrated the existence of one root, we obtain the new theorem: Every polynomial of z can be represented as the product of a linear factor $z - \alpha$ with a polynomial one degree lower.

We now write $\alpha_1$ rather than $\alpha$ and obtain

$$f(z) = (z - \alpha_1)f_1(z).$$

We then apply the obtained theorem to the polynomial $f_1(z)$ and get

$$f_1(z) = (z - \alpha_2)f_2(z),$$

where $f_2(z)$ is of the $(n - 2)$th degree and $\alpha_2$ is a root of the equation $f_1 = 0$. Also in similar fashion:

$$f_2(z) = (z - \alpha_3)f_3(z),$$
$$f_3(z) = (z - \alpha_4)f_4(z), \text{ etc.}$$

In this chain of equations, beginning with the next to last, if we replace every f on the right-hand side with its following value in the equation below, we finally obtain the theorem for the transformation of a polynomial of the nth degree into a product of n linear factors:

$$f(z) = (z - \alpha_1)(z - \alpha_2) \ldots (z - \alpha_n).$$

Expressed verbally: Every integral rational function of the nth degree can be represented as the product of n linear factors.

Thus, the previous equation $f(z) = 0$ allows us to write

$$(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n) = 0.$$

However, the product on the left becomes zero only when one factor is equal to zero. And since $z - \alpha_\nu = 0$ implies $z = \alpha_\nu$, we finally obtain:

The equation $f(z) = 0$ possesses the n roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ and no others.

Thus we have proved the fundamental theorem.

Note. It is possible for several of the n roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ to be equal, for example, for $\alpha_2$ and $\alpha_3$ both to be equal to $\alpha_1$, while $\alpha_4, \alpha_5, \ldots, \alpha_n$ may be different from $\alpha_1$. In this case $\alpha_1$ is called a multiple root, and specifically in the case we have assumed of three equal roots, a triple root.

Sturm’s Problem of the Number of Roots


Find  the  number  of  real  roots  of  an  algebraic  equation  with  real  coefficients 
over  a  given  interval. 


This very important algebraic problem was solved in a surprisingly simple way in 1829 by the French mathematician Charles Sturm (1803-1855). The paper containing the famous Sturm theorem appeared in the eleventh volume of the Bulletin des sciences de Férussac and bears the title, “Mémoire sur la résolution des équations numériques.”


“With  this  major  discovery,”  says  Liouville,  “Sturm  at  once  simplified  and 
perfected  the  elements  of  algebra,  enriching  them  with  new  results.” 

Solution. We distinguish two cases:

I. The real roots of the equation in question are all simple over the given interval.

II. The equation also possesses multiple real roots over the interval.

We will first show that the second case leads us back to the first.

Let the prescribed equation $F(x) = 0$ have the distinct roots $\alpha, \beta, \gamma, \ldots$, and let the root $\alpha$ be a-fold, $\beta$ b-fold, $\gamma$ c-fold, ..., so that

$$F(x) = (x - \alpha)^a (x - \beta)^b (x - \gamma)^c \cdots.$$

For the derivative $F'(x)$ of $F(x)$ we obtain

$$\frac{F'(x)}{F(x)} = \frac{a}{x - \alpha} + \frac{b}{x - \beta} + \frac{c}{x - \gamma} + \cdots = \frac{a(x - \beta)(x - \gamma)(x - \delta)\cdots + b(x - \alpha)(x - \gamma)(x - \delta)\cdots + \cdots}{(x - \alpha)(x - \beta)(x - \gamma)\cdots}.$$

If we then call the numerator of this fraction $p(x)$ and the denominator $q(x)$ and set the whole rational function $F(x)/q(x)$ equal to $G(x)$, then

$$F(x) = G(x)\cdot q(x) \quad\text{and}\quad F'(x) = G(x)\cdot p(x).$$

Now the functions $p(x)$ and $q(x)$ have no common divisor. (The factor $x - \beta$ of $q(x)$ may, for example, go into all the terms of $p(x)$ except the second with no remainder.) It follows from this that $G(x)$ is the greatest common divisor of $F(x)$ and $F'(x)$. This can be determined easily from the divisional algorithm and can therefore be considered known, as a result of which $q(x)$ is known also.

The equation $F(x) = 0$ then falls into the two equations

$$q(x) = 0 \quad\text{and}\quad G(x) = 0,$$

the first of which possesses only simple roots, while the second can be further reduced in the same way that $F(x) = 0$ was.

An equation with multiple roots can therefore always be transformed into equations (with known coefficients) possessing only simple roots.
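The reduction to simple roots can be sketched with exact rational arithmetic. The code below is an illustration only: polynomials are coefficient lists (highest power first), the helper names are hypothetical, and the greatest common divisor is normalised to leading coefficient 1.

```python
from fractions import Fraction

def derivative(f):
    n = len(f) - 1
    return [c * (n - i) for i, c in enumerate(f[:-1])]

def poly_divmod(f, g):
    """Quotient and remainder of f by g (g with non-zero leading coefficient)."""
    f = [Fraction(c) for c in f]
    quot = []
    while len(f) >= len(g):
        t = f[0] / g[0]
        quot.append(t)
        for i in range(len(g)):
            f[i] -= t * g[i]
        f.pop(0)                      # the leading coefficient is now zero
    while f and f[0] == 0:            # strip leading zeros of the remainder
        f.pop(0)
    return quot, f

def poly_gcd(f, g):
    """Divisional algorithm; the result has leading coefficient 1."""
    while g:
        _, r = poly_divmod(f, g)
        f, g = g, r
    return [Fraction(c) / f[0] for c in f]

F = [1, 0, -3, 2]                     # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
G = poly_gcd(F, derivative(F))        # x - 1
q, rest = poly_divmod(F, G)           # q = x^2 + x - 2 = (x - 1)(x + 2)
print(G == [1, -1], q == [1, 1, -2], rest == [])   # True True True
```

The quotient `q` has the same roots as `F`, each counted once.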

Consequently, it is sufficient to solve the problem for the first case. Let $f(x) = 0$ be an algebraic equation all of whose roots are simple. The derivative $f'(x)$ of $f(x)$ then vanishes for none of these roots and the highest common divisor of the functions $f(x)$ and $f'(x)$ is a constant K that differs from zero. We use the divisional algorithm to determine the highest common divisor of $f(x)$ and $f'(x)$, writing, for the sake of convenience in representation, $f_0(x)$ and $f_1(x)$ instead of $f(x)$ and $f'(x)$, and calling the quotients resulting from the successive divisions $q_0(x), q_1(x), q_2(x), \ldots$ and the remainders $-f_2(x), -f_3(x), \ldots$.

If we also drop the argument sign for the sake of brevity, we obtain the following scheme:

(0) $f_0 = q_0 f_1 - f_2,$

(1) $f_1 = q_1 f_2 - f_3,$

(2) $f_2 = q_2 f_3 - f_4,$

and so forth.

In this scheme there must at last appear (at the very latest with the remainder K) a remainder $-f_s(x)$ that does not vanish at any point of the interval and consequently possesses the same sign over the whole interval. Here we break off the algorithm. The functions involved

$$f_0, f_1, f_2, \ldots, f_s$$

form a “Sturm chain” and in this connection are called Sturm functions.

The Sturm functions possess the following three properties:

1. Two neighboring functions do not vanish simultaneously at any point of the interval.

2. At a null point of a Sturm function its two neighboring functions are of different sign.

3. Within a sufficiently small area surrounding a zero point of $f_0(x)$, $f_1(x)$ is everywhere greater than zero or everywhere smaller than zero.

Proof of 1. If, for example, $f_2$ and $f_3$ vanish at any point of the interval, $f_4$ [according to (2)] also vanishes at this point, and consequently $f_5$ also [according to (3)], and so forth, so that finally [according to the last line of the algorithm] $f_s$ also vanishes, which, however, contradicts our assumption.

Proof of 2. If the function $f_3$ vanishes at the point $\sigma$, for example, of the interval, then it follows from (2) that

$$f_2(\sigma) = -f_4(\sigma).$$

Proof of 3. This proof follows from the known theorem: A function [$f_0(x)$] rises or falls at a point depending on whether its derivative [$f_1(x)$] at that point is greater or smaller than zero.

We now select any point x of the interval, note the sign of the values $f_0(x), f_1(x), \ldots, f_s(x)$ and obtain a Sturm sign chain (to obtain an unequivocal sign, however, it must be assumed that none of the designated $s + 1$ function values is zero). The sign chain will contain sign sequences (+ + and − −) and sign changes (+ − and − +).

We will consider the number $Z(x)$ of sign changes in the sign chain and the changes undergone by $Z(x)$ when x passes through the interval. A change can occur only if one or more of the Sturm functions changes sign, i.e., passes over from negative (positive) values through zero to positive (negative) values. We will accordingly study the effect produced on $Z(x)$ by the passage of a function $f_\nu(x)$ through zero.

Let k be a point at which $f_\nu$ vanishes, h a point situated to the left, and l a point to the right of k and so close to k that over the interval h to l the following holds true: (1) $f_\nu(x)$ does not vanish except when $x = k$; (2) every neighbor ($f_{\nu+1}$, $f_{\nu-1}$) of $f_\nu$ does not change sign. We must distinguish between the cases $\nu > 0$ and $\nu = 0$; in the first case we are concerned with the triplet $f_{\nu-1}, f_\nu, f_{\nu+1}$, in the second, with the pair $f_0, f_1$.

In the triplet, $f_{\nu-1}$ and $f_{\nu+1}$ possess either the + and − sign or the − and + sign at all three points h, k, l. Thus, whatever the sign of $f_\nu$ may be at these points, the triplet possesses one change of sign for each of the three arguments h, k, l. The passage through zero of the function $f_\nu$ does not change the number of sign changes in the chain!

In the pair, $f_1$ has either the + or − sign at all three points h, k, l. In the first case, $f_0$ is increasing and is thus negative at h and positive at l. In the second case, $f_0$ is decreasing and is positive at point h, and negative at l. In both cases a sign change is lost.

From our investigation we learn that: The Sturm sign chain undergoes a change in the number of sign changes $Z(x)$ only when x passes through a null point of $f(x)$; and specifically, the chain then loses (with an increasing x) exactly one sign change. Thus, if x passes through the interval (the ends of which do not represent roots of $f(x) = 0$) from left to right, the sign chain loses exactly as many sign changes as there are null points of $f(x)$ within the interval. Result:

Sturm’s theorem: The number of real roots of an algebraic equation with real coefficients whose real roots are simple over an interval the end points of which are not roots is equal to the difference between the numbers of sign changes of the Sturm sign chains formed for the interval ends.

Note. The same considerations can also be applied unchanged to the series formed when we multiply $f_0, f_1, f_2, \ldots, f_s$ by any positive constants; this series is then likewise designated as a Sturm chain. In the formation of the Sturm function chain all fractional coefficients are accordingly avoided.

Example 1. Determine the number and situation of the real roots of the equation $x^5 - 3x - 1 = 0$.

The Sturm chain is

$$f_0 = x^5 - 3x - 1, \quad f_1 = 5x^4 - 3, \quad f_2 = 12x + 5, \quad f_3 = 1.$$

The signs of f for $x = -2, -1, 0, +1, +2$ are

$$\begin{array}{c|cccc}
x & f_0 & f_1 & f_2 & f_3 \\
\hline
-2 & - & + & - & + \\
-1 & + & + & - & + \\
0 & - & - & + & + \\
+1 & - & + & + & + \\
+2 & + & + & + & +
\end{array}$$

The equation thus has three real roots: one between −2 and −1, one between −1 and 0, one between +1 and +2. The other two roots are complex.
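The whole procedure of Example 1 can be reproduced mechanically. The following sketch builds the chain by the divisional algorithm with exact fractions and counts sign changes; it assumes that the equation has only simple roots, the helper names are hypothetical, and the chain it produces differs from the one printed above only by positive constant factors.

```python
from fractions import Fraction

def poly_rem(f, g):
    """Remainder of f divided by g (coefficient lists, highest power first)."""
    f = [Fraction(c) for c in f]
    while len(f) >= len(g):
        t = f[0] / g[0]
        for i in range(len(g)):
            f[i] -= t * g[i]
        f.pop(0)
    while f and f[0] == 0:
        f.pop(0)
    return f

def sturm_chain(f):
    """f0 = f, f1 = f', then f_{k+1} = -(remainder of f_{k-1} by f_k).
    Assumes f has only simple roots, so the chain ends in a non-zero constant."""
    n = len(f) - 1
    chain = [list(f), [c * (n - i) for i, c in enumerate(f[:-1])]]
    while len(chain[-1]) > 1:
        chain.append([-c for c in poly_rem(chain[-2], chain[-1])])
    return chain

def value(f, x):
    v = 0
    for c in f:
        v = v * x + c
    return v

def sign_changes(chain, x):
    signs = [v > 0 for v in (value(f, x) for f in chain) if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

chain = sturm_chain([1, 0, 0, 0, -3, -1])          # x^5 - 3x - 1
Z = [sign_changes(chain, x) for x in (-2, -1, 0, 1, 2)]
print(Z)                                            # [3, 2, 1, 1, 0]
print([Z[i] - Z[i + 1] for i in range(4)])          # [1, 1, 0, 1] roots per interval
```

The differences of consecutive counts give one root in each of the intervals (−2, −1), (−1, 0) and (1, 2), and none in (0, 1).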

Example 2. Determine the number of real roots of the equation $x^5 - ax - b = 0$ when a and b are positive magnitudes and $4^4a^5 > 5^5b^4$.

The Sturm chain reads

$$x^5 - ax - b, \quad 5x^4 - a, \quad 4ax + 5b, \quad 4^4a^5 - 5^5b^4.$$

For the values $x = -\infty$ and $+\infty$ it has the signs

$$- \; + \; - \; + \quad\text{and}\quad + \; + \; + \; +, \text{ respectively.}$$

The equation has three real and two complex roots.

Abel’s Impossibility Theorem


Equations of higher than the fourth degree are in general incapable of algebraic solution.


This famous theorem was first stated by the Italian physician Paolo Ruffini (1765-1822) in his book Teoria generale delle equazioni, published in Bologna in 1798. Ruffini’s proof, however, is incomplete. The first rigorous proof was given in 1826 in the first volume of Crelle’s Journal für Mathematik by the young Norwegian mathematician Niels Henrik Abel (1802-1829). His celebrated paper bore the title “Démonstration de l’impossibilité de la résolution algébrique des équations générales qui dépassent le quatrième degré.”

The  following  proof  of  Abel’s  impossibility  theorem  is  based  on  a  theorem  of 
Kronecker,  published  in  1856  in  the  Monatsberichte  der  Berliner  Akademie. 

We  will  begin  by  presenting  in  a  short  introduction  the  auxiliary  algebraic 
theorems  necessary  for  an  understanding  of  the  Kronecker  proof. 

A system $\mathfrak{K}$ of numbers is called a number group or rational domain when the addition, subtraction, multiplication, and division of two numbers of the system will also yield a number of the system. For brevity we will call the numbers of the system $\mathfrak{K}$-numbers. Two groups are called equal when every number of the one belongs also to the other. The simplest group is that composed of all rational numbers, the group $\mathfrak{R}$ of rational numbers or the natural rationality domain.

A group $\mathfrak{K}' = \mathfrak{K}(\alpha, \beta, \gamma, \ldots)$ created by the “substitution of the magnitudes $\alpha, \beta, \gamma, \ldots$ in a group $\mathfrak{K}$” is understood to mean the totality of all the numbers obtained from the $\mathfrak{K}$-numbers and the substituted magnitudes $\alpha, \beta, \gamma, \ldots$ by one or more applications of the four species (the four basic arithmetic operations), in other words, the totality of all the rational functions of $\alpha, \beta, \gamma, \ldots$ whose coefficients are $\mathfrak{K}$-numbers.

A function $f(x)$ or an equation $f(x) = 0$ in a group is a function or equation whose coefficients are numbers of the group. A polynomial in $\mathfrak{K}$ is understood to mean an integral rational function of the variable x whose coefficients are $\mathfrak{K}$-numbers.

A polynomial

$$F(x) = Ax^n + Bx^{n-1} + \cdots$$

or an equation

$$F(x) = 0$$

in a group $\mathfrak{K}$ is said to be reducible or irreducible in this group accordingly as $F(x)$ is divisible into a product of polynomials of lower degree in $\mathfrak{K}$ or not.

The function $x^2 - 10x + 7$, for example, is irreducible in the group $\mathfrak{R}$, whereas it is reducible in the group $\mathfrak{R}(\sqrt2)$:

$$x^2 - 10x + 7 = (x - 5 - 3\sqrt2)(x - 5 + 3\sqrt2).$$

Abel’s lemma:* The pure equation

x^p = C

of the prime number degree p is irreducible in a group ℜ when C is a number of
the group but not the pth power of a group number.

Indirect proof. Let x^p - C = 0 be reducible, so that

x^p - C = ψ(x)φ(x),

where ψ and φ are polynomials in ℜ, whose free terms A and B are ℜ-numbers.
Since the roots of the equation x^p = C are r, rε, rε^2, ..., rε^{p-1}, where r is one of
the roots and ε a complex pth unit root, and the free term of the equation ψ(x) = 0
or φ(x) = 0, independent of sign, represents the product of the equation’s roots,
then, for example,

A = r^μ ε^M, B = r^ν ε^N.

Since μ and ν possess no common divisor (because μ + ν = p), there are integers
h, k such that

μh + νk = 1.

Thus, we obtain for the product K of the powers A^h and B^k the value rε^{hM + kN}
and, consequently, the value K^p = r^p = C for the pth power of the ℜ-number K. It
was assumed, however, that C must not be the pth power of a ℜ-number.

Consequently, x^p = C cannot be reducible.
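The integers h and k used in the proof can be found by the extended Euclidean algorithm. The sketch below is illustrative only; the function names extended_gcd and bezout_for_split are our own, and mu, nu stand for the degrees of the two assumed factors.

```python
def extended_gcd(a, b):
    """Return (g, h, k) with g = gcd(a, b) and a*h + b*k == g."""
    old_r, r = a, b
    old_h, h = 1, 0
    old_k, k = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_h, h = h, old_h - q * h
        old_k, k = k, old_k - q * k
    return old_r, old_h, old_k


def bezout_for_split(p, mu):
    """For a prime p and a degree mu with 0 < mu < p, put nu = p - mu and
    return (nu, h, k) such that mu*h + nu*k == 1."""
    nu = p - mu
    g, h, k = extended_gcd(mu, nu)
    if g != 1:
        raise ValueError("mu and p - mu share a divisor; p is not prime")
    return nu, h, k


# Example: p = 7 split into degrees 3 and 4.
nu, h, k = bezout_for_split(7, 3)
```

For p = 7 and degrees 3 and 4 this yields a pair h, k with 3h + 4k = 1, as the proof requires.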

Schoenemann’s theorem (Crelle’s Journal, vol. XXXII, 1846): If the
integral coefficients C_0, C_1, C_2, ..., C_{N-1} of the polynomial

f(x) = C_0 + C_1 x + C_2 x^2 + ⋯ + C_{N-1} x^{N-1} + x^N

are divisible by a prime number p, while the free term C_0 is not divisible by p^2,
then f(x) is irreducible in the natural rationality domain.

Indirect proof. Let f be reducible so that f = ψ·φ, with

ψ = a_0 + a_1 x + a_2 x^2 + ⋯ + a_{m-1} x^{m-1} + x^m,

φ = b_0 + b_1 x + b_2 x^2 + ⋯ + b_{n-1} x^{n-1} + x^n.

According to a theorem of Gauss* the coefficients a and b are here integers. We
multiply the expressions for ψ and φ, obtaining, by comparison with f,

C_0 = a_0 b_0,

C_1 = a_0 b_1 + a_1 b_0,

C_2 = a_0 b_2 + a_1 b_1 + a_2 b_0, etc.

Since C_0 is not divisible by p^2, let us say that a_0 is divisible by p, in which case
b_0 is not. Since C_1 and a_0 are divisible by p, while b_0 is not, it follows from
the second line of our scheme that a_1 is divisible by p. Then it follows according
to the third line of our scheme, in which C_2, a_0, a_1 are divisible by p, that a_2 is
also divisible by p, and so forth. Finally, we would be able to conclude that a_m =
1 is also divisible by p, which is naturally absurd. Consequently, f cannot be
reducible.
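The hypothesis of the theorem is easy to test mechanically. The following sketch assumes that p is already known to be prime and that the polynomial is monic; the function name satisfies_schoenemann is our own. Note that the test is only sufficient: a polynomial that fails it may still be irreducible.

```python
def satisfies_schoenemann(coeffs, p):
    """Check the hypothesis of Schoenemann's theorem for the monic polynomial
    C_0 + C_1 x + ... + C_{N-1} x^(N-1) + x^N.

    coeffs = [C_0, C_1, ..., C_{N-1}] (leading coefficient 1 omitted);
    p is assumed to be a prime number (this is not verified here).
    """
    if not coeffs:
        return False
    all_divisible = all(c % p == 0 for c in coeffs)
    free_term_ok = coeffs[0] % (p * p) != 0
    return all_divisible and free_term_ok


# x^5 - 4x - 2 with p = 2: every lower coefficient is even, and -2 is not divisible by 4.
quintic_ok = satisfies_schoenemann([-2, -4, 0, 0, 0], 2)

# x^2 - 4 = (x - 2)(x + 2) with p = 2: the free term is divisible by 4, so the test fails.
reducible_case = satisfies_schoenemann([-4, 0], 2)

# x^2 - 10x + 7 with p = 7: the test does not apply (10 is not divisible by 7),
# although the polynomial is in fact irreducible over the rationals.
not_applicable = satisfies_schoenemann([7, -10], 7)
```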

Reducible  and  irreducible  polynomials  play  the  same  role  among 
polynomials  that  composite  and  prime  numbers  play  among  the  integers.  Thus, 
for  example,  every  reducible  polynomial  can  be  divided  in  only  one  way  into  a 
product  of  irreducible  polynomials.  All  of  the  theorems  concerned  here  are  based 
on  the  fundamental  theorem  of  irreducible  functions. 

Abel’s irreducibility theorem:* If one root of the equation f(x) = 0, which
is irreducible in ℜ, is also a root of the equation F(x) = 0 in ℜ, then all the roots
of the irreducible equation are roots of F(x) = 0. At the same time F(x) can be
divided by f(x) without a remainder:

F(x) = f(x)·F_1(x),

where F_1(x) is also a polynomial in ℜ.

The  simple  proof  of  this  theorem  is  based  on  the  familiar  algorithm  for 
finding  the  highest  common  divisor  g(x)  of  two  arbitrary  polynomials  F(x)  and 
f(x) in ℜ. This algorithm leads through a chain of divisions, in which all the
coefficients are ℜ-numbers, to the pair of equations

F(x) = F_1(x)·g(x), f(x) = f_1(x)·g(x)

and to the equation

V(x)F(x) + v(x)f(x) = g(x),

where all the indicated functions are polynomials in ℜ.

If the prescribed functions F and f have no common divisor, then g(x) is a
constant which is for convenience set equal to 1.
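The chain of divisions can be carried out with exact rational arithmetic. The sketch below is one possible arrangement, not the author's: polynomials are lists of Fraction coefficients, lowest degree first, and all helper names (poly, trim, add, scale, mul, divmod_poly, extended_gcd) are our own. The inputs are assumed not to be both zero.

```python
from fractions import Fraction


def poly(*coeffs):
    """Build a polynomial from coefficients, lowest degree first."""
    return [Fraction(c) for c in coeffs]


def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])


def scale(p, c):
    return trim([c * a for a in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(num, den):
    """Return (quotient, remainder) of num / den; den must not be zero."""
    num, den = trim(num), trim(den)
    quot = [Fraction(0)] * max(len(num) - len(den) + 1, 0)
    rem = num
    while len(rem) >= len(den):
        shift = len(rem) - len(den)
        c = rem[-1] / den[-1]
        quot[shift] = c
        # subtract c * x^shift * den, which cancels the leading term of rem
        rem = add(rem, scale([Fraction(0)] * shift + den, -c))
    return trim(quot), rem


def extended_gcd(F, f):
    """Return (g, V, v) with V*F + v*f == g, where g is the monic common divisor."""
    r0, r1 = trim(F), trim(f)
    V0, V1 = [Fraction(1)], []
    v0, v1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        V0, V1 = V1, add(V0, scale(mul(q, V1), Fraction(-1)))
        v0, v1 = v1, add(v0, scale(mul(q, v1), Fraction(-1)))
    lead = r0[-1]
    return scale(r0, 1 / lead), scale(V0, 1 / lead), scale(v0, 1 / lead)


# x^3 - 1 and x^2 - 1 share the divisor x - 1.
F = poly(-1, 0, 0, 1)
f = poly(-1, 0, 1)
g, V, v = extended_gcd(F, f)
```

For x^3 - 1 and x^2 - 1 the routine returns g(x) = x - 1 together with V(x) = 1 and v(x) = -x, so that V(x)F(x) + v(x)f(x) = g(x).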

If f is irreducible and a root α of f = 0 is also a root of F = 0, then there exists
a common divisor of at least the first degree (x - α). Since f is irreducible, f_1(x)
must equal 1 and f(x) = g(x), and then

F(x) [= F_1(x)·g(x)] = F_1(x)·f(x).

F(x) is thus divisible by f(x) and vanishes for every zero point of f(x). Q.E.D.
The  fundamental  theorem  directly  implies  two  important  corollaries: 

I. If a root of an equation f(x) = 0, which is irreducible in ℜ, is also a root of
an equation F(x) = 0 in ℜ of lower degree than f, then all the coefficients of F are
equal to zero.

II. If f(x) = 0 is an irreducible equation in a group ℜ, then there is no other
irreducible equation in ℜ that has a common root with f(x) = 0.

The commonest case of substitution in a group ℜ consists of the substitution
of a root α of an irreducible equation of the nth degree

f(x) = x^n + a_1 x^{n-1} + ⋯ + a_n = 0

into ℜ. A number ζ of the group ℜ' = ℜ(α) defined by this substitution is a
rational function of α with coefficients from ℜ and can be written ζ = Ψ(α)/Φ(α),
where Ψ and Φ are polynomials in ℜ. Since α^n = -a_1 α^{n-1} - a_2 α^{n-2} - ⋯ - a_n,
every power of α with the exponent n or with a higher exponent can be
expressed by the powers α^{n-1}, α^{n-2}, ..., α, so that we may write ζ = ψ(α)/φ(α),
where ψ and φ are polynomials in ℜ of no higher than the (n - 1)th degree.

Since f(x) and φ(x) possess no common divisor, two polynomials u(x) and
v(x) can be found (see above) in ℜ, such that u(x)φ(x) + v(x)f(x) = 1. If in this
equation we set x = α, then [since f(α) = 0] u(α)·φ(α) = 1, i.e., ζ = ψ(α)·u(α). We
multiply this out and once again eliminate every power of α whose exponent is n
or greater. This finally gives us

ζ = c_0 + c_1 α + c_2 α^2 + ⋯ + c_{n-1} α^{n-1},

where the c_ν are ℜ-numbers; i.e.,

III. Every number of the group ℜ(α), where α is a root of an irreducible
equation of the nth degree in ℜ, can be represented as a polynomial of the
(n - 1)th degree of α with coefficients that are ℜ-numbers. There is only one such
possible way of representing it.

[From

c_0 + c_1 α + ⋯ + c_{n-1} α^{n-1} = C_0 + C_1 α + ⋯ + C_{n-1} α^{n-1}

it follows that

d_0 + d_1 α + ⋯ + d_{n-1} α^{n-1} = 0, with d_ν = C_ν - c_ν.

Then the function of the (n - 1)th degree

d_0 + d_1 x + ⋯ + d_{n-1} x^{n-1}

vanishes for a root of f(x) = 0 and, according to corollary I., must have nothing
but evanescent coefficients. From d_ν = 0, however, it follows that C_ν = c_ν.]
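The elimination of the higher powers of the substituted root can be illustrated by a short routine. This is a minimal sketch under our own conventions, not the author's notation: a number of the extended group is stored as the list of its n coefficients, lowest power first, and the monic equation is given by its non-leading coefficients f_low = [a_n, ..., a_1], free term first. The names reduce_mod and multiply are hypothetical.

```python
from fractions import Fraction


def reduce_mod(poly, f_low):
    """Rewrite a polynomial in alpha using alpha^n = -(a_n + ... + a_1 alpha^(n-1)).

    poly  : coefficients of powers of alpha, lowest power first
    f_low : [a_n, a_(n-1), ..., a_1], the non-leading coefficients of the monic
            equation x^n + a_1 x^(n-1) + ... + a_n = 0, free term first
    Returns exactly n coefficients c_0, ..., c_(n-1).
    """
    n = len(f_low)
    c = [Fraction(x) for x in poly]
    while len(c) > n:
        top = c.pop()        # coefficient of the highest remaining power
        shift = len(c) - n   # that power equals alpha^shift * alpha^n
        for i, a in enumerate(f_low):
            c[shift + i] -= top * a
    return c + [Fraction(0)] * (n - len(c))


def multiply(u, v, f_low):
    """Product of two numbers of the extended group, in normal form."""
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += a * b
    return reduce_mod(prod, f_low)


# Example: alpha is a root of x^3 - 2 = 0, so f_low = [-2, 0, 0].
f_low = [-2, 0, 0]
product = multiply([1, 1, 0], [0, 0, 1], f_low)  # (1 + alpha) * alpha^2
```

With α^3 = 2 the product (1 + α)·α^2 = α^2 + α^3 reduces to 2 + α^2, i.e., to the coefficient list [2, 0, 1].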

We  have  just  seen  a  simple  example  of  an  irreducible  function  that  became 
reducible  by  substitution  of  a  root. 

Let us consider the more general case in which an irreducible function f(x) in
ℜ of prime number degree p becomes reducible by substitution of a root α of an
irreducible equation of the qth degree g(x) = 0 in ℜ, in which, therefore, f(x) can
be divided into the product of the two polynomials ψ(x, α) and φ(x, α), which
may be of the mth and nth degree of x, respectively.

Now the function in ℜ

u(x) = f(r) - ψ(r, x)φ(r, x),

where r is some rational number, vanishes for x = α. According to the
fundamental theorem of irreducible functions, u(x) is then evanescent for all
roots α, α', α'', ... of the irreducible equation g(x) = 0.

Since, for example, the equation

f(x) - ψ(x, α')φ(x, α') = 0

is therefore valid for every rational x, it is valid for all the values of x, so that by
identity

f(x) = ψ(x, α')φ(x, α'),

and similarly for all other roots of g(x) = 0.

From the q equations

f(x) = ψ(x, α)φ(x, α),

f(x) = ψ(x, α')φ(x, α'), etc.,

thus obtained, it follows by multiplication that

f(x)^q = Ψ(x)·Φ(x),

where Ψ(x) and Φ(x) are the products of the q polynomials ψ(x, α), ψ(x, α'), ...
and φ(x, α), φ(x, α'), ..., respectively. Since each of these products is a
symmetrical function of the roots of g(x) = 0, each product can be expressed
rationally according to the Waring theorem by the coefficients of g(x) = 0 [and
naturally by x], so that Ψ(x) and Φ(x) are polynomials in ℜ.

Now Ψ(x) certainly vanishes for at least one root of the irreducible equation
f(x) = 0, as does Φ(x). Consequently both Ψ(x) and Φ(x) can be divided without a
remainder by f(x), and since f is irreducible no other divisor than f is possible, as
a result of which

Ψ(x) = f(x)^μ, Φ(x) = f(x)^ν,

with μ + ν = q. Comparing the degree of the left and right sides, we obtain

mq = μp, nq = νp

and from these, since m and n are smaller than p, it follows that p is a divisor of
q. We therefore obtain the theorem:

IV.  An  irreducible  equation  of  the  prime  number  degree  p  in  a  group  can 
become  reducible  through  substitution  of  a  root  of  another  irreducible  equation 
in  this  group  only  when  p  is  a  divisor  of  the  degree  of  the  latter  equation. 

After  this  introduction  we  can  turn  to  the  proof  of  Abel’s  theorem.  First, 
however,  we  will  consider  what  is  meant  by  an  algebraically  soluble  equation. 

An equation of the nth degree f(x) = 0 in a group ℜ is called algebraically
soluble when it is soluble by a series of radicals, i.e., when a root ω can be
determined in the following manner:

1. Determination of the ath root α = ᵃ√R of an ℜ-number R, which is not,
however, an ath power of an ℜ-number, and substitution of α into ℜ, so that the
group 𝔄 = ℜ(α) is formed;

2. Determination of the bth root β = ᵇ√A of an 𝔄-number A, which, however,
is not a bth power of an 𝔄-number, and substitution of β into 𝔄, so that the group
𝔅 = 𝔄(β) = ℜ(α, β) is formed;

3. Determination of the cth root γ = ᶜ√B of a 𝔅-number B, which, however, is
not a cth power of a 𝔅-number, and the substitution of γ into 𝔅, so that the group
ℭ = 𝔅(γ) = ℜ(α, β, γ) is formed, etc., until these successive substitutions of
radicals α, β, γ, ... at length result in a group to which ω, the sought-for root,
belongs and in which f(x) [since it possesses the divisor x - ω] becomes
reducible. It is here assumed that all the radical exponents a, b, c, ... are prime
numbers. This does not represent a restriction since any extraction of roots with
composite exponents can be reduced to successive extractions of roots with
prime  exponents  (e.g.,  V'ii  =  Vo with  v  =  Vu). 

In order to shorten our task somewhat, we will limit ourselves to equations
f(x) = 0 which possess rational coefficients, so that ℜ is the natural rationality
domain, which are, moreover, irreducible in ℜ, and which are of the degree n,
which is an odd prime number.

Let the first substitution be that of the nth root of unity

α = η = ⁿ√1 = cos(2π/n) + i sin(2π/n).

According to IV., this substitution still does not make f reducible, since η is a
root of the equation x^{n-1} + x^{n-2} + ⋯ + x + 1 = 0, the degree of which is < n.


Also, with each substituted radical of our series, which still does not allow
division of f(x), we will also substitute at the same time the complex conjugate
radical. Though this may be superfluous, it can certainly do no harm.

Let λ = ˡ√K be the radical the addition of which to the preceding radicals
makes f(x) reducible, so that f(x) is still indivisible in the group 𝔎 (to which the
number K belongs), but becomes divisible in 𝔏 = 𝔎(λ):

f(x) = ψ(x, λ)·φ(x, λ)·χ(x, λ) ....

Here the factors ψ, φ, χ, ... are irreducible polynomials in 𝔏 (but naturally not
polynomials in 𝔎) whose coefficients are polynomials of λ in 𝔎.

Since, according to IV., the prime number n must be a divisor of the prime
number l, l must be equal to n.

The l roots of the equation x^l = K, which is irreducible in 𝔎 according to
Abel’s lemma, are

λ_0 = λ, λ_1 = λη, λ_2 = λη^2, ..., λ_ν = λη^ν, ..., λ_{n-1} = λη^{n-1}.

Since ψ(x, λ) is a divisor of f(x), then ψ(x, λ_ν) also goes into f(x) without a
remainder (cf. the proof of IV.).

Every one of the n functions ψ(x, λ_ν) is irreducible in 𝔏.

[As in the proof of IV., it follows from ψ(x, λ_ν) = u(x, λ_ν)·v(x, λ_ν) that
ψ(x, λ) = u(x, λ)·v(x, λ), but this equation is impossible because ψ(x, λ) is
irreducible in 𝔏.]

No two of the n functions ψ(x, λ_ν) are equal. [In ψ(x, λη^μ) = ψ(x, λη^ν), λ could, as
before, be replaced by the root λη^{-μ}, from which it would follow that

ψ(x, λ) = ψ(x, λH),

where H represents the root of unity η^{ν-μ}. Here λ could in turn be replaced by
λH, which would give

ψ(x, λH) = ψ(x, λH^2).

Similarly, it would follow that

ψ(x, λH^2) = ψ(x, λH^3),

etc. Thus, we would then have

ψ(x, λ) = ψ(x, λH) = ψ(x, λH^2) = ⋯,

i.e., also

ψ(x, λ) = [ψ(x, λ) + ψ(x, λH) + ⋯ + ψ(x, λH^{n-1})]/n.

The right side of this equation, however, as a symmetrical function of the n roots
λ, λH, λH^2, ... of x^n = K, is a polynomial of x in 𝔎, so that ψ(x, λ) would also be a
polynomial of x in 𝔎. This, however, contradicts what was stipulated above
concerning f(x).]

For these two reasons it follows that f(x) is divisible by the product Ψ(x) of
the n different factors ψ(x, λ), ψ(x, λη), ..., ψ(x, λη^{n-1}) that are irreducible in 𝔏:

f(x) = Ψ(x)·U(x),

where Ψ (as a symmetrical function of the roots of x^n = K), and consequently U
as well, are polynomials of x in 𝔎. Now, since f(x) is not reducible in 𝔎, U(x)
must equal 1 and necessarily

f(x) = Ψ(x) = ψ(x, λ)ψ(x, λη) ... ψ(x, λη^{n-1}).

The postulated divisibility of f(x) for the group 𝔏 consequently reveals itself
as a divisibility into linear factors. Thus, if ω, ω_1, ω_2, ..., ω_{n-1} are the roots and
x - ω, x - ω_1, ..., x - ω_{n-1} are the linear factors of f(x), then

x - ω = ψ(x, λ), x - ω_1 = ψ(x, λη), ..., x - ω_{n-1} = ψ(x, λη^{n-1}),

and consequently

ω = K_0 + K_1 λ + K_2 λ^2 + ⋯ + K_{n-1} λ^{n-1},

ω_1 = K_0 + K_1 λ_1 + K_2 λ_1^2 + ⋯ + K_{n-1} λ_1^{n-1},

...

ω_{n-1} = K_0 + K_1 λ_{n-1} + K_2 λ_{n-1}^2 + ⋯ + K_{n-1} λ_{n-1}^{n-1},

where all the K_ν are 𝔎-numbers.

Now the equation f(x) = 0 has at least one real root, since it is of an odd
degree. Let this real root be

ω = K_0 + K_1 λ + ⋯ + K_{n-1} λ^{n-1}.

We  distinguish  two  cases: 

I. The base K of the reducible radical λ is real;

II.  the  base  K  is  complex. 

Case I. Here we can assume that λ is real, since the nth roots of unity
belong to the group 𝔎. In that event the complex conjugate of ω is

ω̄ = K̄_0 + K̄_1 λ + ⋯ + K̄_{n-1} λ^{n-1},

where the complex conjugates K̄_ν of K_ν are also 𝔎-numbers. From ω̄ = ω it
follows then that

(K̄_0 - K_0) + (K̄_1 - K_1)λ + ⋯ + (K̄_{n-1} - K_{n-1})λ^{n-1} = 0,

and from this, taking theorem I into consideration, it follows that K̄_ν = K_ν for
every ν. The magnitudes K_0, K_1, ..., K_{n-1} are therefore also real.

Furthermore,

ω_ν = K_0 + K_1 λ_ν + ⋯ + K_{n-1} λ_ν^{n-1}

and

ω_{n-ν} = K_0 + K_1 λ_{n-ν} + ⋯ + K_{n-1} λ_{n-ν}^{n-1}.

However, since λ_ν = λη^ν and λ_{n-ν} = λη^{n-ν} = λη^{-ν} are complex conjugates, it
follows that ω_ν and ω_{n-ν} are also complex conjugates, i.e.:

The equation f(x) = 0 possesses one real root and n - 1 paired conjugate
complex roots (ω_1 and ω_{n-1}, ω_2 and ω_{n-2}, etc.).

Case II. In this case we substitute, in addition to the reducible radical
λ = ⁿ√K, the complex conjugate λ̄ = ⁿ√K̄, with the result that the real magnitude
Λ = λλ̄ is also substituted.

If the substitution of Λ alone (i.e., without λ) were sufficient to make
f(x) reducible, this would give us the situation of Case I. We may therefore
assume that f(x) is still irreducible in 𝔎(Λ) and does not become reducible until
the additional substitution of λ.

From

ω = K_0 + K_1 λ + ⋯ + K_{n-1} λ^{n-1}

it follows that

ω̄ = K̄_0 + K̄_1 λ̄ + ⋯ + K̄_{n-1} λ̄^{n-1}

and from this, since ω̄ = ω and λ̄ = Λ/λ, that

K_0 + K_1 λ + ⋯ + K_{n-1} λ^{n-1} = K̄_0 + K̄_1 (Λ/λ) + ⋯ + K̄_{n-1} (Λ/λ)^{n-1}.

In this equation all of the magnitudes with the exception of λ belong to the group
𝔎(Λ), and since the equation x^n = K (according to Abel’s lemma) is irreducible in
this group, we are able to replace λ in the above equation by any root λ_ν of x^n =
K.

If we do this and keep in mind that

Λ/λ_ν = λλ̄/(λη^ν) = λ̄η^{-ν} = λ̄_ν,

we obtain

K_0 + K_1 λ_ν + ⋯ + K_{n-1} λ_ν^{n-1} = K̄_0 + K̄_1 λ̄_ν + ⋯ + K̄_{n-1} λ̄_ν^{n-1}

or

ω_ν = ω̄_ν.

Thus, all the roots of f(x) = 0 are real.

The  combination  of  the  results  of  I.  and  II.  yields  the 

Kronecker*  theorem:  An  algebraically  soluble  equation  of  an  odd  degree 
that  is  a  prime  and  which  is  irreducible  in  the  natural  rationality  domain 
possesses  either  only  one  real  root  or  only  real  roots. 

Kronecker’s theorem proves at the same time that an equation of higher than
the  fourth  degree  cannot  be  solved  generally  by  algebraic  means. 


The  simple  fifth-degree  equation 


x^5 - ax - b = 0,

for example, cannot be solved algebraically when a and b are positive integers
that are divisible by a prime number p, b is indivisible by p^2, and when 4^4 a^5 >
5^5 b^4.

According to Schoenemann’s theorem the equation is irreducible. Sturm’s
theorem (No. 24) proves that it possesses three real roots and two complex roots.
Consequently, the equation is algebraically insoluble according to Kronecker’s
theorem.

In exactly the same way it can be shown that

x^7 - ax - b = 0

is algebraically insoluble when 6^6 a^7 > 7^7 b^6, etc.
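The count of real roots claimed for the fifth-degree equation can be verified with an exact Sturm chain. The sketch below is our own arrangement of Sturm's theorem (coefficient lists highest degree first, signs compared at minus and plus infinity); the function names are hypothetical, and a = 4, b = 2 (with p = 2) is merely one admissible choice.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num / den; coefficient lists are highest degree first."""
    rem = [Fraction(c) for c in num]
    while len(rem) >= len(den):
        factor = rem[0] / den[0]
        for i in range(len(den)):
            rem[i] -= factor * den[i]
        rem.pop(0)  # the leading coefficient has just been cancelled
    while rem and rem[0] == 0:
        rem.pop(0)  # strip any remaining leading zeros
    return rem


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def sturm_chain(p):
    """p, p', then the negated remainders of successive divisions."""
    chain = [[Fraction(c) for c in p], [Fraction(c) for c in derivative(p)]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    return chain


def sign_changes(values):
    signs = [v for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if (a > 0) != (b > 0))


def count_real_roots(p):
    """Number of distinct real roots of p, by Sturm's theorem."""
    chain = sturm_chain(p)
    at_plus_inf = [q[0] for q in chain]
    at_minus_inf = [q[0] * (-1) ** (len(q) - 1) for q in chain]
    return sign_changes(at_minus_inf) - sign_changes(at_plus_inf)


def criterion(a, b):
    """The inequality 4^4 a^5 > 5^5 b^4 from the text."""
    return 4 ** 4 * a ** 5 > 5 ** 5 * b ** 4


# x^5 - 4x - 2: here 4^4 * 4^5 = 262144 exceeds 5^5 * 2^4 = 50000.
roots_a4_b2 = count_real_roots([1, 0, 0, 0, -4, -2])
```

For a = 4, b = 2 the chain gives three real roots, in agreement with the inequality; for a = 2, b = 2 the inequality fails and only one real root is found.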


26 


The  Hermite-Lindemann  Transcendence  Theorem 


The  expression 


A_1 e^{α_1} + A_2 e^{α_2} + A_3 e^{α_3} + ⋯,


in  which  the  coefficients  A  differ  from  zero  and  in  which  the  exponents  a  are 
algebraic  numbers  differing  from  each  other,  cannot  equal  zero. 

This  extremely  important  theorem  (see  below)  was  proved  in  1882  by  the 
German  mathematician  Lindemann  (in  the  Berliner  Sitzungsberichte)  after  the 
French  mathematician  Hermite  (1822-1901),  in  vol.  77  of  the  Comptes  rendus  in 
1873,  had  proved  the  special  case  in  which  the  coefficients  and  exponents  were 
rational integers. Lindemann’s proof, which required a great many higher
mathematical  tools,  was  simplified  to  such  an  extent,  first  (1885,  Berliner 
Sitzungsberichte)  by  K.  Weierstrass  (1815-1897),  then  (1893,  Mathematische 
Annalen,  vol.  43)  by  P.  Gordan  (1837-1912),  that  the  proof  is  now  generally 
accessible.  The  proof  is  presented  here  essentially  in  the  form  given  to  it  in  his 
textbook  of  algebra  by  H.  Weber  (1842-1913). 

The proof is indirect. We assume that there are l algebraic numbers A_1, A_2,
..., A_l and l algebraic numbers α_1, α_2, ..., α_l differing from one another that
satisfy the equation

(1) A_1 e^{α_1} + A_2 e^{α_2} + ⋯ + A_l e^{α_l} = 0,

and we show that this assumption leads to a contradiction. The demonstration is
divided into four steps.

I. We consider the coefficients A as roots of a real equation 𝔄(x) = 0 with
rational coefficients the degree of which, L, will generally be greater than l. Let
the roots of this equation be A_1, A_2, ..., A_l, ..., A_L. We form all the possible
l-termed expressions A_r e^{α_1} + A_s e^{α_2} + ⋯ [totaling L(L - 1)(L - 2) ... (L - l + 1)
elements], where A_r, A_s, ... are any l components of the series A_1, A_2, ..., A_L, and
we multiply these expressions together, always combining each of the members
with the same exponential factor e^β. The resulting product has the form

Π' = A'_1 e^{β_1} + A'_2 e^{β_2} + ⋯ + A'_m e^{β_m},

where the A' are nonevanescent magnitudes.

[That the coefficients A' obtained by multiplying out and combining cannot
all vanish is proved in the following manner. We call the first of the two
complex numbers x + iy and X + iY the “smaller” when either x < X or x = X if y
is at the same time < Y. Now the product Π' consists only of factors of the form
F_ν = P e^{p_ν} + Q e^{q_ν} + R e^{r_ν} + ⋯, where none of the coefficients P, Q, R vanishes,
and we can consider the terms as being arranged in such a manner that
p_ν < q_ν < r_ν < ⋯. On multiplying the factors F_ν the exponent p_1 + p_2 + p_3 + ...
of the first term obtained is then the smallest of all the exponents obtained and
occurs only once. Consequently, at the very least the first term of the
multiplied-out product differs from zero, which was what we set out to prove.]

The coefficients A' are not changed by transpositions of the magnitudes A_1,
A_2, ..., A_L; in other words, they are symmetrical functions of the roots of 𝔄(x) =
0, and, therefore, according to the principal theorem concerning symmetrical
functions, are rational numbers.

Since the left side of (1) is also among the factors of Π',

Π' = 0.

We multiply this equation by the common denominator of the A'’s and obtain the
new equation

(2) B_1 e^{β_1} + B_2 e^{β_2} + ⋯ + B_m e^{β_m} = 0,

where the β are different algebraic numbers and the coefficients B are nonevanescent
rational integers.

II. Let us consider the exponents β as roots of an algebraic equation 𝔅(x) = 0
with rational coefficients of degree M, with M generally greater than m, and let
us in the usual way think of the equation as being free of identical roots. We
form the M(M - 1)(M - 2) ... (M - m + 1) m-termed sums

B_1 e^{vβ_r} + B_2 e^{vβ_s} + ⋯,

where v is a variable and β_r, β_s, ... are any m roots of 𝔅(x) = 0, and multiply
these sums by each other, once again combining terms with the same exponential
factor. The resulting product has the form

Π = C_1 e^{γ_1 v} + C_2 e^{γ_2 v} + ⋯ + C_n e^{γ_n v},

where the coefficients C are nonevanescent rational integers and γ represents
different algebraic numbers.

The product Π is a symmetrical function of the roots of 𝔅(x) = 0.
Consequently, the coefficients of the expansion of Π according to the powers of
v are also symmetrical functions of those roots; thus, for example, the coefficient
k_ν of v^ν:

k_ν = (C_1 γ_1^ν + C_2 γ_2^ν + ⋯ + C_n γ_n^ν)/ν!.

Every coefficient k_ν is therefore a rational number. Accordingly, if g(x) is an
integral rational function of x with coefficients that are rational integers, the sum
of the terms C_s g(γ_s), taken for s = 1, ..., n, is rationally composed of the
coefficients k_ν and is consequently a rational number.

Now since the product Π for v = 1 contains the factor B_1 e^{β_1} + B_2 e^{β_2} + ⋯ +
B_m e^{β_m}, which is equal to zero according to (2), the product for v = 1 is also equal
to zero, and we obtain the equation

(3) C_1 e^{γ_1} + C_2 e^{γ_2} + ⋯ + C_n e^{γ_n} = 0,

in addition to which for every integral rational function g(x) with integral rational
coefficients

(3a) C_1 g(γ_1) + C_2 g(γ_2) + ⋯ + C_n g(γ_n)

is a rational number.


III. We consider the exponents γ_1, γ_2, ..., γ_n as roots of an algebraic equation

x^N + r_1 x^{N-1} + r_2 x^{N-2} + ⋯ + r_N = 0

with rational coefficients of degree N ≥ n, possessing no identical roots.

We multiply this equation by the Nth power of the common denominator H of
the coefficients r_1, r_2, ... and obtain

(Hx)^N + H r_1 (Hx)^{N-1} + H^2 r_2 (Hx)^{N-2} + ⋯ = 0

or, if we write X instead of Hx and call the integers H r_1, H^2 r_2, H^3 r_3, ... g_1, g_2,
g_3, ...,

f(X) = X^N + g_1 X^{N-1} + g_2 X^{N-2} + ⋯ + g_N = 0.

If Γ_1, Γ_2, ..., Γ_N are the roots of this equation, then

f(X) = (X - Γ_1)(X - Γ_2) ... (X - Γ_N).

The roots Γ possess the n values Γ_1 = Hγ_1, Γ_2 = Hγ_2, ..., Γ_n = Hγ_n.

Since Γ represents integral algebraic numbers, then, as a result of (3a),

(3b) C_1 g(Γ_1) + C_2 g(Γ_2) + ⋯ + C_n g(Γ_n)

is a rational integer.

Besides f(X) we will consider the function

φ(X) = f(X)/(X - Γ_1) + f(X)/(X - Γ_2) + ⋯ + f(X)/(X - Γ_N)

     = (X - Γ_2)(X - Γ_3) ... (X - Γ_N) + (X - Γ_1)(X - Γ_3)(X - Γ_4) ... (X - Γ_N) + ⋯

     = N X^{N-1} + N_1 X^{N-2} + ⋯,

which is not evanescent for any of the values Γ_1, Γ_2, ..., Γ_N, and the coefficients
of which N, N_1, ... (as symmetrical functions of the roots Γ_1, Γ_2, ..., Γ_N of f(X) =
0) are rational integers.

If the sum

C_1 φ(Γ_1) + C_2 φ(Γ_2) + ⋯ + C_n φ(Γ_n)

should by chance equal zero, we select the positive integral exponent h (< n) in
such a manner that the (integral) sum

G = C_1 Γ_1^h φ(Γ_1) + C_2 Γ_2^h φ(Γ_2) + ⋯ + C_n Γ_n^h φ(Γ_n) ≠ 0.


[Such an exponent must exist, because otherwise the n linear homogeneous
equations

1·x_1 + 1·x_2 + ⋯ + 1·x_n = 0,

Γ_1·x_1 + Γ_2·x_2 + ⋯ + Γ_n·x_n = 0,

Γ_1^2·x_1 + Γ_2^2·x_2 + ⋯ + Γ_n^2·x_n = 0,

...

Γ_1^{n-1}·x_1 + Γ_2^{n-1}·x_2 + ⋯ + Γ_n^{n-1}·x_n = 0

would exist for the n nonevanescent “unknowns” x_1 = C_1 φ(Γ_1), ..., x_n = C_n φ(Γ_n).
This, however, is impossible, since then the determinant

| 1          1          ...  1          |
| Γ_1        Γ_2        ...  Γ_n        |
| Γ_1^2      Γ_2^2      ...  Γ_n^2      |
| ...                                   |
| Γ_1^{n-1}  Γ_2^{n-1}  ...  Γ_n^{n-1}  |

of the equation system would have to disappear; however, this determinant
represents the product of all the differences Γ_r - Γ_s, in which r > s, and, in
accordance with the above, none of which disappear.]
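The determinant property invoked in the bracketed argument can be confirmed on a small numerical instance. The sketch below is illustrative only; it expands the determinant directly over permutations, which is practical only for small n, and the sample values 2, 3, 5, 7 as well as the function names are our own choice.

```python
from itertools import permutations
from math import prod


def determinant(matrix):
    """Determinant by the permutation expansion (suitable for small matrices)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        sign = -1 if inversions % 2 else 1
        total += sign * prod(matrix[i][perm[i]] for i in range(n))
    return total


def power_matrix(values):
    """Rows are the 0th, 1st, ..., (n-1)th powers of the given values."""
    n = len(values)
    return [[v ** k for v in values] for k in range(n)]


def product_of_differences(values):
    """Product of all differences values[r] - values[s] with r > s."""
    n = len(values)
    return prod(values[r] - values[s] for r in range(n) for s in range(r))


sample = [2, 3, 5, 7]
det_value = determinant(power_matrix(sample))
diff_value = product_of_differences(sample)
```

Both quantities agree, and the product vanishes only when two of the values coincide, which is the fact used above.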

IV. Now we put the fundamental property of the exponential function (the
series expansion for e^x) into the form most suited for our proof.

This is

e^x = 1 + x + x^2/2! + ⋯ + x^ν/ν! + x^{ν+1}/(ν + 1)! + ⋯.

We multiply this equation by H^ν ν! and obtain (Hx = X)

e^x ν! H^ν = H^ν ν! + ν H^{ν-1}(ν - 1)! X + ν_2 H^{ν-2}(ν - 2)! X^2 + ⋯ + X^ν

           + X^ν [x/(ν + 1) + x^2/((ν + 1)(ν + 2)) + ⋯],

where ν_2 = ν(ν - 1)/2 denotes the second binomial coefficient of ν.

In order to write this formula more conveniently, we introduce the symbol 𝔖,
which will be defined by the following direction:

A function F(𝔖) shall be considered the expression obtained when F(𝔖), on
the assumption that 𝔖 is a number, is transformed in the usual way into a power
series of 𝔖 and 𝔖^ν is replaced by ν! H^ν at the end of expansion.

Our formula can then be written in the simple form:

e^x 𝔖^ν = (𝔖 + X)^ν + X^ν [ ].

If we then designate the absolute magnitude of x as ξ, the absolute magnitude of
[ ] is smaller than

ξ/(ν + 1) + ξ^2/((ν + 1)(ν + 2)) + ⋯

and therefore certainly smaller than

1 + ξ + ξ^2/2! + ⋯ = e^ξ.

If ε is understood to be a magnitude the absolute value of which is a proper
fraction, we therefore obtain

(4) e^x 𝔖^ν = (X + 𝔖)^ν + ε e^ξ X^ν.
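The symbolic rule can be checked on the polynomial part of the formula. The following sketch assumes only the rule stated above (the νth power of the symbol is replaced by ν! H^ν after expansion); the symbol is written S in the code and all function names are our own. It compares the symbolically expanded binomial with ν! H^ν times the first ν + 1 terms of the exponential series, using exact rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial


def symbolic_power(nu, H):
    """Value assigned to the symbolic power S^nu, namely nu! * H^nu."""
    return factorial(nu) * H ** nu


def symbolic_binomial(nu, H, X):
    """(S + X)^nu expanded by the binomial theorem, then S^j replaced by j! H^j."""
    return sum(
        comb(nu, j) * symbolic_power(nu - j, H) * X ** j for j in range(nu + 1)
    )


def truncated_series_side(nu, H, x):
    """nu! H^nu * (1 + x + ... + x^nu / nu!), the terms of e^x up to x^nu."""
    partial = sum(Fraction(x) ** j / factorial(j) for j in range(nu + 1))
    return factorial(nu) * H ** nu * partial


# With X = H * x both sides agree exactly, e.g. for nu = 4, H = 3, x = 1/2.
H, x, nu = 3, Fraction(1, 2), 4
left = symbolic_binomial(nu, H, H * x)
right = truncated_series_side(nu, H, x)
```

The remaining terms of the series, those beyond x^ν, are exactly the bracket [ ] multiplied by X^ν.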

We will immediately extend this somewhat further. Let

V(X) = X^k + K_1 X^{k-1} + K_2 X^{k-2} + ⋯ + K_k

represent an integral rational function of X with integral rational coefficients. We
form (4) for ν = k, k - 1, k - 2, ..., multiply the resulting equations by 1, K_1, K_2,
..., and add. This gives us

(5) e^x V(𝔖) = V(X + 𝔖) + e^ξ V̄(X),

with

(5a) V̄(X) = ε_0 X^k + ε_1 K_1 X^{k-1} + ε_2 K_2 X^{k-2} + ⋯,

where the absolute values of the magnitudes ε_κ are proper fractions.

If Λ_1, Λ_2, ... represent the roots of V(X) = 0 and d represents the greatest of
the k values |X| + |Λ_κ|, it follows from

V(X) = (X - Λ_1)(X - Λ_2) ...

that the absolute magnitude of V̄(X) [like that of V(X)] is smaller than d^k:

(5b) |V̄(X)| < d^k.

We apply the results (5), (5a), (5b) to the function

V(X) = F(X)^q Φ(X),

in which

F(X) = X^h f(X), Φ(X) = X^h φ(X),

q = p - 1, and p is a preliminarily selected, still undetermined prime number.
Since the degree of F(X) is h + N, and the degree of Φ(X) is h + N - 1, V(X) is of
the degree k = (h + N)q + h + N - 1.

Equation (5) is now transformed into

e^x V(𝔖) = V(X + 𝔖) + ε e^ξ d^k,

where d is the greatest of the k values |X| + |Λ_κ| and ε is a number whose absolute
magnitude is a proper fraction.

We now choose for x and X the values γ_ν and Γ_ν, respectively (ν is any one of
the numbers 1 to n). Then ξ is the absolute magnitude c_ν of γ_ν and d = d_ν is the
greatest of the k sums |Γ_ν| + |Λ_κ|.

If D then represents the greatest of the 2n numbers d_ν^{h+N} and e^{c_ν} d_ν^{h+N-1}, then
the improper fraction D/d_ν^{h+N} is ≥ e^{c_ν} d_ν^{h+N-1}/D, and consequently

(D/d_ν^{h+N})^q ≥ e^{c_ν} d_ν^{h+N-1}/D

or

D^p ≥ e^{c_ν} d_ν^k

must be true, and we obtain the somewhat simpler formula

(6) e^{γ_ν} V(𝔖) = V(Γ_ν + 𝔖) + η_ν D^p,

where |η_ν| < 1.


The expansion of V(Γ_ν + 𝔖) according to the powers of 𝔖 gives us

V(Γ_ν + 𝔖) = ψ_0 𝔖^q + ψ_1 𝔖^{q+1} + ψ_2 𝔖^{q+2} + ⋯,

where the coefficients ψ are integral rational functions of Γ_ν with integral rational
coefficients. In particular,

ψ_0 = ψ_0(Γ_ν) = Φ(Γ_ν)^p.

[For ν = 1, for example,

F(Γ_1 + 𝔖) = (Γ_1 + 𝔖)^h [𝔖(𝔖 + Γ_1 - Γ_2)(𝔖 + Γ_1 - Γ_3) ...]

           = Γ_1^h (Γ_1 - Γ_2)(Γ_1 - Γ_3) ... (Γ_1 - Γ_N)·𝔖 + ⋯

           = Φ(Γ_1)·𝔖 + ⋯

and

Φ(Γ_1 + 𝔖) = (Γ_1 + 𝔖)^h φ(Γ_1 + 𝔖) = Φ(Γ_1) + ⋯,

consequently

V(Γ_1 + 𝔖) = Γ_1^{hp} φ(Γ_1)^p·𝔖^q + ⋯ = Φ(Γ_1)^p·𝔖^q + ⋯.]

If we introduce this expansion into (6), we finally obtain

e^{γ_ν} V(𝔖) = ψ_0(Γ_ν)𝔖^q + ψ_1(Γ_ν)𝔖^p + ψ_2(Γ_ν)𝔖^{p+1} + ⋯ + η_ν D^p.

This formula, multiplied by C_ν, we then form for all ν from 1 through n, and
we add the resulting n equations.

According to (3), we then obtain

(7) 0 = G_0 𝔖^q + G_1 𝔖^p + G_2 𝔖^{p+1} + ⋯ + λD^p,

where

G_r = C_1 ψ_r(Γ_1) + C_2 ψ_r(Γ_2) + ⋯ + C_n ψ_r(Γ_n)

is, according to (3b), a rational integer and λ is a number the absolute magnitude
of which does not exceed the n-fold value of the maximum |C|-value.

We now replace 𝔖^r with H^r r!, divide (7) by the then universally common
factor H^q, abbreviate D/H as E, and combine all the terms containing the factor
p!, and we obtain

(8) G_0 q! + G' p! = Λ E^p,

where G' is an integer and Λ = -λH (since D^p/H^q = H·E^p).

Now we compare

G_0 = C_1 Φ(Γ_1)^p + C_2 Φ(Γ_2)^p + ⋯ + C_n Φ(Γ_n)^p

with

G = C_1 Φ(Γ_1) + C_2 Φ(Γ_2) + ⋯ + C_n Φ(Γ_n),

the latter of which, according to our assumption concerning h, differs from zero.

If we expand G^p according to the polynomial theorem, every term of the
expansion, with the exception of the n terms C_ν^p Φ(Γ_ν)^p, is the p-multiple of an
integral algebraic number, and, therefore,

(9) G^p = [C_1^p Φ(Γ_1)^p + ⋯ + C_n^p Φ(Γ_n)^p] + pρ,

where ρ is an integral algebraic number (which is, in fact, integral and rational).

Now according to Fermat’s theorem* every difference C_ν^p - C_ν, as well as G^p
- G, is an integral multiple c_ν p and gp, respectively, of p. Accordingly, (9) is
transformed into

G + gp = (C_1 + c_1 p)Φ(Γ_1)^p + ⋯ + (C_n + c_n p)Φ(Γ_n)^p + pρ

       = C_1 Φ(Γ_1)^p + ⋯ + C_n Φ(Γ_n)^p + pρ' = G_0 + ρ'p,

where ρ' is also integral and algebraic.

This equation simplifies into

G_0 = G + g'p,

where g' = g - ρ' is an integral algebraic number, and is also an integral rational
number, as a result of g' = (G_0 - G)/p.

If we introduce this value into (8), we obtain

G q! + g' p! + G' p! = Λ E^p

or, if the integer G' + g' is designated as 𝔊,

(10) G + 𝔊p = Λ E^p/q!.

We now choose a prime number p so large that (1) p > |G| and (2) the
absolute magnitude of the right side of (10) is smaller than 1.

Equation  (10)  then  contains  a  contradiction.  On  the  left  side  of  the  equation 
there is an integer that is indivisible by p (because G ≠ 0) and is thus not equal to
zero,  while  on  the  right  there  is  a  number  whose  absolute  magnitude  is  less  than 
1.  This  is  impossible.  Consequently,  the  initial  equation  (1)  is  also  impossible 
and  Lindemann’s  theorem  is  proved. 

The  inferences  that  can  be  drawn  from  Lindemann’s  theorem  are  amazing. 
Here  we  present  only  a  few: 

1.  The  transcendence  of  e :  The  Euler  number  e  is  transcendent,  i.e.,  it  is 
not  an  algebraic  number.  (In  other  words,  it  cannot  be  a  root  of  an  algebraic 
equation  with  rational  coefficients.) 

2. The transcendence of π: The Archimedes (Ludolph) number π is
transcendent.

According to Euler (No. 13), there exists the equation

e^(iπ) + 1 = 0.

According to Lindemann’s theorem the exponent iπ cannot, therefore, be an
algebraic number. Consequently, it is also impossible for π to be an algebraic
number. (If π were algebraic, then the product of the two algebraic numbers i and
π would have to be algebraic.)

Thus,  the  ancient  question  of  squaring  the  circle  is  answered,  though  the 
answer  is  negative: 

It  is  impossible  to  draw  with  a  compass  and  straight-edge  a  square  that  is 
equal  in  area  to  a  given  circle. 

If, for example, we choose the radius of the given circle in such a manner that
it is equal to the unit length, the area of the circle is π and the desired side of the
square is √π. If, however, √π could be drawn with compass and straight-edge,
then the square π of this segment could also be constructed, and, according to
No. 36, π would have to be the root of an algebraic equation with rational
coefficients (whose degree would be a power of 2). However, π is transcendent.

3. The exponential curve y = e^x passes through no algebraic point of the
plane except the point 0|1.

(An  algebraic  point  is  a  point  whose  coordinates  x  and  y  are  both  algebraic 


numbers.)  Since  algebraic  points  are  omnipresent  in  densely  concentrated 
quantities  within  the  plane,  the  exponential  curve  accomplishes  the  remarkably 
difficult  feat  of  winding  between  all  these  points  without  touching  any  of  them. 

The same is, naturally, also true of the logarithmic curve y = ln x.

4.  The  sine  curve  y  =  sin  x  also  passes  through  no  algebraic  points  of  the 
plane  except  the  lattice  point  0|0. 

If, for example, a|f (with a ≠ 0) were an algebraic point situated on the sine curve, f
would be equal to sin a or, since 2i sin a = e^(ia) - e^(-ia), e^(ia) - e^(-ia) - 2if = 0.
However, according to Lindemann’s theorem, this equation cannot exist for
algebraic numbers a, f.


* A triangular number is a number n such that it is possible to construct with n points a lattice of
congruent equilateral triangles whose vertexes are the points. The first triangular numbers are
1 = ½·1·2, 3 = 1 + 2 = ½·2·3, 6 = 1 + 2 + 3 = ½·3·4, 10 = 1 + 2 + 3 + 4 = ½·4·5, etc.

*  The  reader  who  is  unfamiliar  with  this  fact  will  find  the  proof  in  note  2  at  the  end  of  this  number,  p.  63. 

*  A  lattice  point  is  a  point  whose  coordinates  are  integers. 

* Abel, Œuvres complètes, vol. II, p. 196.

* GAUSS’ THEOREM: If a polynomial f = x^N + C_1 x^(N-1) + C_2 x^(N-2) + ... + C_N with integral
coefficients is divisible into a product of two polynomials ψ = x^m + α_1 x^(m-1) + ... + α_m and
φ = x^n + β_1 x^(n-1) + ... + β_n with rational coefficients (f = ψφ), then the coefficients of these
polynomials are integers.

PROOF. We bring the α_ν and the β_ν to their least common denominators a_0 and b_0, respectively, so that
α_ν = a_ν/a_0 and β_ν = b_ν/b_0, and the numbers a_0, a_1, a_2, ..., a_m as well as the numbers b_0, b_1, ..., b_n
possess no common divisor, and we obtain

F = ΨΦ with F = a_0 b_0 f,

Ψ = a_0 x^m + a_1 x^(m-1) + ... + a_m, Φ = b_0 x^n + b_1 x^(n-1) + ... + b_n.

Let p be a prime divisor of a_0 b_0.

Then all the coefficients of F are divisible by p, but not all those of Ψ and Φ. We combine those terms of Ψ and Φ,
respectively, whose coefficients are divisible by p, to form the respective polynomials U and V, and
similarly combine those terms whose coefficients are not divisible by p to form the polynomials u and v, so
that F = (U + u)(V + v), and consequently

uv = F - UV - Uv - Vu.

The right-hand side of this equation is a polynomial in which, according to our assumptions for F, U,
and V, every coefficient is divisible by p; the left side, however, is not, since the coefficient of the highest
power of the left side, being the product of two factors a_r and b_s that are not divisible by p, is also not
divisible by p.

This contradiction disappears only when a_0 b_0 has no prime divisor, i.e., when a_0 = 1 and b_0 = 1, in
which case the α_ν and β_ν are integers.

* N. H. Abel, “Mémoire sur une classe particulière d’équations résolubles algébriquement,” Crelle’s
Journal, vol. IV, 1829.

*  Leopold  Kronecker  (1823-1891),  a  German  mathematician. 


* FERMAT’S THEOREM: For every integer g and every prime number p the difference g^p - g is
divisible by p.

PROOF. The theorem is self-evident if g is divisible by p. For every g that is indivisible by p the theorem
follows directly from the congruences (1a) and (2a) of No. 19, if g is substituted for D there and the
congruences are squared. In both cases g^(p-1) ≡ 1 mod p is obtained, and from this g^p ≡ g mod p.


Planimetric Problems
Euler’s Straight Line


In  all  triangles  the  center  of  the  circumscribed  circle,  the  point  of  intersection 
of  the  medians,  and  the  point  of  intersection  of  the  altitudes  are  situated  in  this 
order  in  a  straight  line — the  Euler  line — and  are  spaced  in  such  a  manner  that 
the  altitude  intersection  is  twice  as  far  from  the  median  intersection  as  the  center 
of  the  circumscribed  circle  is. 

Leonhard  Euler  (1707-1783)  was  one  of  the  greatest  and  most  fertile 
mathematicians  of  all  time.  His  writings  comprise  45  volumes  and  over  700 
papers,  most  of  them  long  ones,  published  in  periodicals. 

The  above  theorem  is  among  the  results  of  the  paper  “Solutio  facilis 
problematum  quorundam  geometricorum  difficillimorum,”  which  appeared  in 
the journal Novi commentarii Academiae Petropolitanae (ad annum 1765).

The  following  proof  of  the  Euler  theorem  is  distinguished  by  its  great 
simplicity. 

In  the  triangle  ABC  let  M  be  the  midpoint  of  side  AB,  S  the  median 
intersection,  which  lies  on  CM,  so  that 

(1)  SC  =  2SM, 

and  U  the  center  of  the  circle  of  circumscription,  lying  on  the  perpendicular 
bisector  of  AB. 

We  extend  US  by  SO  so  that 


(2)  SO  =  2  SU, 

and  join  O  to  C. 

According to (1) and (2) the triangles MUS and COS are similar (the sides
enclosing the equal vertical angles at S are in the ratio 1 : 2).
Consequently, CO ∥ MU, i.e., CO ⊥ AB, or expressed verbally, the line connecting
the  point  O  with  a  vertex  of  the  triangle  is  perpendicular  to  the  side  of  the 
triangle  opposite  the  vertex;  consequently,  the  connecting  line  is  an  altitude  of 
the  triangle. 

The three altitudes consequently pass through point O. This is, therefore, the
altitude  intersection,  and  Euler’s  theorem  is  proved. 

Note. Our proof contains at the same time the solution to the interesting

Problem of Sylvester: To find the resultant of the three vectors UA, UB,
UC acting on the center of the circle of circumscription U of the triangle ABC.

FIG. 5.

Since  UM  is  half  the  resultant  of  the  two  vectors  UA  and  UB,  CO  represents 
in  magnitude  and  direction  the  whole  resultant  of  these  vectors.  Now,  since  UO 
is  the  resultant  of  UC  and  CO,  UO  is  the  resultant  we  are  seeking. 

The resultant of the vectors represented by the three radii from the center of
the  circle  of  circumscription  to  the  vertexes  of  the  triangle  is  the  segment 
extending  from  the  center  of  the  circle  of  circumscription  to  the  altitude 
intersection. 
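
The two statements can be checked numerically. The following is only a minimal sketch under our own assumptions: the sample triangle, the coordinate formula for the circumcenter, and the helper names `circumcenter`, `euler_points`, and `dot` are illustrative choices and do not come from the text. The point O is built exactly as in the proof, by extending US beyond S so that SO = 2 SU; we then test that O lies on the three altitudes and that UA + UB + UC = UO.

```python
def circumcenter(a, b, c):
    """Circumcenter U of the triangle with vertices a, b, c given as (x, y)."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy)


def euler_points(a, b, c):
    """Return (U, S, O): circumcenter, median intersection, and the point O
    obtained by extending US beyond S so that SO = 2 SU."""
    u = circumcenter(a, b, c)
    s = ((a[0] + b[0] + c[0]) / 3.0, (a[1] + b[1] + c[1]) / 3.0)
    o = (s[0] + 2.0 * (s[0] - u[0]), s[1] + 2.0 * (s[1] - u[1]))
    return u, s, o


def dot(p, q, r, t):
    """Scalar product of the vectors pq and rt."""
    return (q[0] - p[0]) * (t[0] - r[0]) + (q[1] - p[1]) * (t[1] - r[1])


# Hypothetical sample triangle.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
U, S, O = euler_points(A, B, C)

# O lies on all three altitudes: CO is perpendicular to AB, AO to BC, BO to CA.
altitude_checks = [dot(C, O, A, B), dot(A, O, B, C), dot(B, O, C, A)]

# Sylvester: the resultant UA + UB + UC should equal the vector UO.
resultant = tuple(sum(p[i] - U[i] for p in (A, B, C)) for i in range(2))
uo = (O[0] - U[0], O[1] - U[1])
```

All three entries of `altitude_checks` vanish up to rounding, and `resultant` agrees with `uo`, as the theorem and the note require.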

James  Joseph  Sylvester  (1814-1897)  was  an  English  jurist  and 
mathematician. 
The Feuerbach Circle


In every triangle the three midpoints of the sides, the three base points of the
altitudes, and the midpoints of the three altitude sections touching the vertexes lie on
a circle.


This  circle  was  already  known  to  Euler  (1765),  but  is  most  commonly  called 
the  Feuerbach  circle  after  Karl  Feuerbach  (1800-1834)  [the  uncle  of  the  painter 
Anselm  Feuerbach],  who  rediscovered  it  in  1822.  It  is  also  known  as  the  nine- 
point  circle,  although  it  passes  through  many  other  significant  points  as  well  as 
those  indicated  above. 

The  proof  consists  of  two  steps.  In  the  first  we  demonstrate  that  the  circle 
circumscribing  the  triangle  of  the  three  midpoints  of  the  sides  passes  through  the 


base  points  of  the  altitudes;  and  in  the  second  we  show  that  the  circle 
circumscribing  the  triangle  of  the  altitude  base  points  passes  through  the 
midpoints  of  altitude  sections. 




I. Let ABC represent the prescribed triangle, A', B', C' the midpoints,
respectively, of sides BC, CA, AB. Let H be the base point of the altitude AH.
Then the trapezoid HA'B'C' is isosceles. (A'B', as a midline of the triangle ABC, is
equal to ½AB; HC', as the radius of the Thales circle having the diameter AB, is
also equal to ½AB.) The trapezoid is therefore a quadrilateral inscribed in a circle.
All of the altitude base points consequently lie on the circle 𝔉 circumscribing the
triangle A'B'C'.


FIG. 8.


II. Let the altitudes of the triangle ABC be AH, BK, CL, and O their point of
intersection. We will now show that the midpoint of each altitude section touching a
vertex, let us say section OC, also lies on 𝔉. For this purpose we consider the
triangle OBC, which also has the altitude bases H, K, L. According to I., the
circle 𝔉 circumscribing the altitude base triangle (HKL) of this triangle passes
through the midpoints of its sides, and thus in particular through the midpoints of OB and OC,
which completes the proof.

Corollary.  The  midpoint  F  of  the  Feuerbach  circle  lies  at  the  center  of  the 
Euler  line  OU,  and  the  radius  f  of  the  Feuerbach  circle  is  equal  to  one  half  the 
radius  of  the  circle  of  circumscription  of  the  triangle  ABC. 

The  first  of  these  propositions  follows  from  the  fact  that  the  perpendicular 
bisectors  of  the  Feuerbach  circle  chords  HA'  and  KB',  as  midlines  of  the 
trapezoids  UOHA'  and  UOKB'  pass  through  the  center  of  OU,  and  the  second, 
from  the  fact  that  the  sides  of  the  triangle  A'B'C'  inscribed  in  the  Feuerbach 
circle  are  one  half  the  size  of  the  sides  of  the  triangle  ABC. 
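
As a numerical illustration, and only as a sketch under our own assumptions (the sample triangle and the helper names `circumcenter`, `foot`, `mid`, and `nine_points` are hypothetical), we can collect the nine points, take F as the midpoint of OU, and compare each distance with half the radius of the circle of circumscription. The altitude intersection is taken as O = A + B + C - 2U, which is the relation SO = 2 SU of the preceding problem written in coordinates.

```python
import math


def circumcenter(a, b, c):
    """Circumcenter U of the triangle with vertices a, b, c given as (x, y)."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy)


def foot(p, q, r):
    """Base point of the perpendicular dropped from p onto the line qr."""
    dx, dy = r[0] - q[0], r[1] - q[1]
    t = ((p[0] - q[0]) * dx + (p[1] - q[1]) * dy) / (dx * dx + dy * dy)
    return (q[0] + t * dx, q[1] + t * dy)


def mid(p, q):
    """Midpoint of the segment pq."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def nine_points(a, b, c):
    """Return (points, F, R): the nine points, the midpoint F of OU,
    and the radius R of the circle of circumscription."""
    u = circumcenter(a, b, c)
    o = (a[0] + b[0] + c[0] - 2.0 * u[0], a[1] + b[1] + c[1] - 2.0 * u[1])
    points = [
        mid(b, c), mid(c, a), mid(a, b),            # midpoints of the sides
        foot(a, b, c), foot(b, c, a), foot(c, a, b),  # altitude base points
        mid(o, a), mid(o, b), mid(o, c),            # midpoints of OA, OB, OC
    ]
    return points, mid(o, u), math.dist(u, a)


# Hypothetical sample triangle.
points, F, R = nine_points((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
distances = [math.dist(F, p) for p in points]
```

Every entry of `distances` equals R/2 up to rounding, in agreement with the corollary.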
Castillon’s Problem


To  inscribe  in  a  given  circle  a  triangle  the  sides  of  which  pass  through  three 
given  points. 


This  problem,  posed  by  the  Swiss  mathematician  Cramer,  takes  its  name  from 
the  Italian  mathematician  Castillon,  who  solved  it  in  1776.  (Gabriel  Cramer, 
1704-1752, in 1750 published his major work Introduction à l’analyse des lignes
courbes algébriques, in which, for the first time, a system of linear equations
was  solved  by  means  of  determinants.  I.  F.  Salvemini,  1709-1791,  took  the  name 
Castillon  after  his  place  of  birth  Castiglione  in  Tuscany.) 

The following simple, though not easily seen, solution of the Castillon
problem stems from the Italian Giordano di Ottaiano.

We call the given circle 𝔎, the given points A, B, C, the desired triangle XYZ,
and let YZ, ZX, XY pass, respectively, through A, B, C.

Ottaiano in his solution makes use of four auxiliary points. These are:

I. the end point of the chord parallel to AB and beginning from X;

II. the point of intersection of the lines YI and AB;

III. the end point of the chord beginning at X that is parallel to II C;

IV. the point of intersection of the lines C II and I III.

The  construction  consists  of  the  following  five  steps. 


1. Construction of auxiliary point II. The angles A II I and XIY, as
alternate interior angles between parallels, are equal, and the angles XZY and XIY
are equal because they are inscribed in the same arc XY. Consequently,

∠XZY = ∠A II I

and therefore BZY II is a quadrilateral inscribed in a circle. It also follows from
this that

A II · AB = AY · AZ.

Since, however, the right side of this equation is known to be the power P of the
circle 𝔎 at A (see p. 152), it follows that

A II = P/AB

can be constructed, as a result of which II is known.

FIG. 9.

2. Construction of auxiliary point IV. The angles YC IV and YX III are
corresponding angles between parallels and are consequently equal, while angles
YI III and YX III are supplementary since they are opposite angles in the
quadrilateral inscribed in the circle. Thus, YI III and YC IV are also
supplementary, and YC IV I is a quadrilateral inscribed in a circle. It follows from
this that

II C · II IV = II Y · II I.

However, since the right side of this equation represents the power Π of circle 𝔎
at II, which, according to 1., is to be regarded as known, we find

II IV = Π / II C

and thus the auxiliary point IV.

3. Determination of the angle I X III = ω. Since angle A II IV = κ is known
and since ω and κ, having pairwise parallel sides, are equal, it follows that

ω = κ.

4. Construction of the chord I III. We draw through IV a chord
subtending the inscribed angle ω = κ. (All chords subtending a given inscribed
angle have the same length and therefore touch a circle concentric with 𝔎; the
required chord is a tangent from IV to that circle.) The points of intersection of
this chord with 𝔎 are the remaining points I and III.

5. Construction of the triangle XYZ. We determine X as the point of
intersection of 𝔎 with the line through III parallel to II IV; Y as the point of
intersection of the line I II with 𝔎; and Z as the point of intersection of the line
AY with 𝔎.

In  comparison  to  this  fairly  intricate  solution  the  following  projective  solution 
of  the  Castillon  problem  is  very  simple. 

This  solution  is  based  upon  Steiner’s  double  element  construction  (No.  60) 
and the involution theorem: If a ray is rotated about a fixed point, its two points
of  intersection  with  a  circle  describe  on  this  circle  (involutional)  projective 
ranges  of  points  (No.  63). 

We take any arbitrary point X_1 on the given circle 𝔎, determine the (second)
point of intersection Z_1 of the circle with the secant BX_1, then the (second) point
of intersection Y_1 of the circle with the secant AZ_1 and, finally, the (second) point
of intersection X_1' of the circle with the secant CY_1. Only when X_1' happens to
coincide with X_1 is X_1Y_1Z_1 the sought-for triangle. This favorable situation will,
however, occur only rarely. We will consider the described construction as
repeated with other starting points X_2, X_3, ..., giving us the points Z_2, Z_3, ...;
Y_2, Y_3, ...; X_2', X_3', .... According to the auxiliary theorem each of the ranges of points
X_1, X_2, ...; Z_1, Z_2, ...; Y_1, Y_2, ...; and X_1', X_2', ... is projective with respect to the
following one; consequently, the first range X_1, X_2, ... is also projective with
respect to the last range X_1', X_2', ....

The desired triangle is obtained from the described construction when the
starting point X_ν coincides with the end point X_ν' and is accordingly determined
by a double element of this projection. This gives us the following simple

Construction: We choose any three points X_1, X_2, X_3 on 𝔎, draw in the
manner described the three corresponding points X_1', X_2', X_3' and determine
according to Steiner the double elements X_r and X_s of the projection on 𝔎 in
which the points X_1', X_2', X_3' correspond to X_1, X_2, X_3. Thus, each of the two
triangles X_rY_rZ_r and X_sY_sZ_s satisfies the conditions of the Castillon problem.
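
The double elements can also be computed instead of drawn. The sketch below is an analytic substitute for Steiner's construction, not the construction itself, and rests on assumptions of our own: the given circle is taken as the unit circle of the complex plane, none of the points A, B, C lies on it, and the names `involution`, `compose`, `apply_map`, and `castillon` are hypothetical. For a point z of the unit circle, the second intersection of the secant through p is z' = (z - p)/(conj(p) z - 1); the chain X → Z → Y → X' is the product of three such maps, and its fixed points are the roots of a quadratic equation.

```python
import cmath


def involution(p):
    """Coefficients (a, b, c, d) of z -> (a z + b)/(c z + d), the map sending a
    point z of the unit circle to the second intersection of the line pz."""
    p = complex(p)
    return (1.0, -p, p.conjugate(), -1.0)


def compose(m2, m1):
    """Coefficients of the map 'first m1, then m2'."""
    a2, b2, c2, d2 = m2
    a1, b1, c1, d1 = m1
    return (a2 * a1 + b2 * c1, a2 * b1 + b2 * d1,
            c2 * a1 + d2 * c1, c2 * b1 + d2 * d1)


def apply_map(m, z):
    a, b, c, d = m
    return (a * z + b) / (c * z + d)


def castillon(A, B, C, tol=1e-7):
    """Triangles (X, Y, Z) inscribed in the unit circle with YZ through A,
    ZX through B, and XY through C. Returns zero, one, or two triangles."""
    mA, mB, mC = involution(A), involution(B), involution(C)
    a, b, c, d = compose(mC, compose(mA, mB))   # X -> Z -> Y -> X'
    if abs(c) < 1e-14:
        raise ValueError("degenerate configuration, not handled in this sketch")
    # Double elements: c z^2 + (d - a) z - b = 0.
    root = cmath.sqrt((a - d) ** 2 + 4.0 * b * c)
    triangles = []
    for x in (((a - d) + root) / (2.0 * c), ((a - d) - root) / (2.0 * c)):
        if abs(abs(x) - 1.0) < tol:             # keep double elements on the circle
            z = apply_map(mB, x)
            y = apply_map(mA, z)
            triangles.append((x, y, z))
    return triangles


# Hypothetical test data: start from a known inscribed triangle and place
# A, B, C on its side lines, then recover the triangle.
X0, Y0, Z0 = cmath.exp(0.3j), cmath.exp(2.0j), cmath.exp(4.0j)
A = Y0 + 1.7 * (Z0 - Y0)
B = Z0 + 0.4 * (X0 - Z0)
C = X0 - 0.5 * (Y0 - X0)
solutions = castillon(A, B, C)
```

When both roots of the quadratic lie on the circle there are two solution triangles, as in the construction; when they do not, the problem has no real solution for the given points.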

Note. In a quite similar manner we are able to solve the converse of the
Castillon problem:

To draw about a circle a triangle the vertexes of which lie on three given lines.

The construction is based upon the auxiliary theorem:

If a point describes a straight line, the two tangents from the point to a circle
determine upon this circle two (involutional) projective fields of tangents (No.
63).

We call the given circle 𝔎, the given lines a, b, c, the sides of the desired
triangle x, y, z.

We draw any three tangents x_1, x_2, x_3 to 𝔎; through their points of intersection
with b we draw three more tangents z_1, z_2, z_3; through the points of intersection
of the latter with a we draw three new tangents y_1, y_2, y_3, and through their
intersections with c three more tangents x_1', x_2', x_3'. We draw the double
elements x_r and x_s of the projection defined on 𝔎 by the homologous triplets (x_1,
x_2, x_3) and (x_1', x_2', x_3'). The triangles x_ry_rz_r and x_sy_sz_s obtained from these double
elements are the ones we are seeking.
Malfatti’s Problem


To  draw  within  a  given  triangle  three  circles  each  of  which  is  tangent  to  the 
other  two  and  to  two  sides  of  the  triangle. 


This famous problem was posed by the Italian mathematician Malfatti (1731-1807)
in 1803 and solved in the tenth volume of the Memorie di Matematica e di
Fisica della Società italiana delle Scienze. This algebraic-geometric solution can
be found, for example, in vol. 123 of Ostwald’s Klassiker der exakten
Wissenschaften (Supplement). The purely geometric solution of Malfatti’s
problem submitted by Jakob Steiner in 1826 without proof is also described and
proved there. Here we will restrict ourselves to the exposition of the thoroughly
simple solution published by Schellbach in volume 45 of Crelle’s Journal.


Let ABC be the given triangle with sides a, b, c, the perimeter 2s and the
angles α, β, γ. Let the Malfatti circles we are seeking (which are tangent to the
arms of the angles α, β, γ) be 𝔓, 𝔔, ℜ, their midpoints P, Q, R, and their radii p,
q, r. Let the tangents from the vertexes A, B, C to 𝔓, 𝔔, ℜ be u, v, w.

We introduce 𝔍, the circle inscribed in the triangle. Let its center be J and its
radius ρ, and let the tangents to it from the vertexes A, B, C be a_1, b_1, c_1, respectively.

From the three equations

b_1 + c_1 = a, c_1 + a_1 = b, a_1 + b_1 = c

we obtain the values

a_1 = s - a, b_1 = s - b, c_1 = s - c.

Since the points P and J lie on the bisector of the angle α, it follows from the
ray theorem that

p : ρ = u : a_1 or p = (ρ/a_1) u.

Similarly we find q = (ρ/b_1) v.

We call the points of tangency of 𝔓 and 𝔔 with AB, U and V and calculate UV
= t. Since PF, the perpendicular dropped from P to QV, is equal to t, it follows
from the right triangle PQF that

PQ^2 = PF^2 + FQ^2 or (p + q)^2 = t^2 + (p - q)^2

and from this

UV = t = 2√(pq).

If we then introduce here the values found above for p and q, we obtain

t = 2ρ √(uv/(a_1 b_1)).

FIG. 11.

But it is known that

ρ = √(a_1 b_1 c_1 / s).

This simplifies the value for t to

UV = t = 2 √(c_1/s) √(uv).

Since the side AB of the triangle is composed of the three segments AU, UV, and
VB, we obtain the equation

u + v + 2 √(c_1/s) √(uv) = c.

In the same way we obtain for the two other sides of the triangle BC and CA

v + w + 2 √(a_1/s) √(vw) = a

and

w + u + 2 √(b_1/s) √(wu) = b.

Taking half the perimeter as the unit length, we obtain somewhat more
simply:

(1)  v + w + 2 √(a_1) √(vw) = a,
     w + u + 2 √(b_1) √(wu) = b,
     u + v + 2 √(c_1) √(uv) = c.

Now we take the proper fractions a, b, c, u, v, w as squares of the sines of six
acute angles λ, μ, ν, ψ, φ, χ:

sin^2 λ = a, sin^2 μ = b, sin^2 ν = c,

sin^2 ψ = u, sin^2 φ = v, sin^2 χ = w.

Then also (since a + a_1 = s = 1, b + b_1 = 1, c + c_1 = 1) cos^2 λ = a_1, cos^2 μ = b_1,
cos^2 ν = c_1, and the obtained equation triplet (1) assumes the form:

(2)  sin^2 φ + sin^2 χ + 2 sin φ sin χ cos λ = sin^2 λ,
     sin^2 χ + sin^2 ψ + 2 sin χ sin ψ cos μ = sin^2 μ,
     sin^2 ψ + sin^2 φ + 2 sin ψ sin φ cos ν = sin^2 ν.

Now, for example, let us consider the first of these equations. It is nothing other
than a trigonometric expression of the known relation (φ + χ = λ) between the
angles φ and χ at two vertexes of a triangle and the exterior angle λ at the
third vertex. If, for example, we take such a triangle with a circle of
circumscription of the diameter 1, then the three sides are sin φ, sin χ, sin λ, and
the cosine theorem gives the equation

sin^2 λ = sin^2 φ + sin^2 χ + 2 sin φ sin χ cos λ.

It then follows from (2) that

φ + χ = λ, χ + ψ = μ, ψ + φ = ν

and from this

ψ = σ - λ, φ = σ - μ, χ = σ - ν, with σ = (λ + μ + ν)/2.


Thus,  we  obtain  the  following  simple 

Construction: 

1. We draw three angles λ, μ, ν whose sine squares are equal to the sides of
the given triangle (where half the perimeter of the triangle is the unit length).

2. We draw the half sum

σ = (λ + μ + ν)/2

of the three angles λ, μ, ν and the three new angles

ψ = σ - λ, φ = σ - μ, χ = σ - ν.

3. We draw the sine squares of the three angles ψ, φ, χ. These are the
tangents from the triangle vertexes to the three Malfatti circles.
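
The three steps of the construction translate directly into a short computation. This is only a sketch under our own assumptions: the function names `malfatti_tangents` and `malfatti_radii` and the sample triangle with sides 3, 4, 5 are illustrative, and the results are scaled back from the unit s = 1 to the original unit of length. The radii are obtained from the ray theorem, i.e., each radius is the inradius times the tangent to the Malfatti circle divided by the tangent to the inscribed circle.

```python
import math


def malfatti_tangents(a, b, c):
    """Tangents (u, v, w) from the vertexes A, B, C to the Malfatti circles."""
    s = (a + b + c) / 2.0
    # Step 1: angles whose sine squares are the sides measured in units of s.
    lam, mu, nu = (math.asin(math.sqrt(side / s)) for side in (a, b, c))
    # Step 2: half sum and the three new angles.
    sigma = (lam + mu + nu) / 2.0
    psi, phi, chi = sigma - lam, sigma - mu, sigma - nu
    # Step 3: sine squares of the new angles, scaled back to the original unit.
    return tuple(s * math.sin(angle) ** 2 for angle in (psi, phi, chi))


def malfatti_radii(a, b, c):
    """Radii (p, q, r) of the Malfatti circles near A, B, C."""
    s = (a + b + c) / 2.0
    rho = math.sqrt((s - a) * (s - b) * (s - c) / s)   # radius of the inscribed circle
    u, v, w = malfatti_tangents(a, b, c)
    return (rho * u / (s - a), rho * v / (s - b), rho * w / (s - c))


# Hypothetical sample triangle.
a, b, c = 3.0, 4.0, 5.0
u, v, w = malfatti_tangents(a, b, c)
p, q, r = malfatti_radii(a, b, c)

# Each side is composed of two tangents and the segment 2*sqrt(product of radii).
residuals = (
    v + w + 2.0 * math.sqrt(q * r) - a,
    w + u + 2.0 * math.sqrt(r * p) - b,
    u + v + 2.0 * math.sqrt(p * q) - c,
)
```

The three `residuals` vanish up to rounding, which confirms that the computed tangents and radii satisfy the side equations from which the construction was derived.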

Note. If we are to draw the sine square m = sin^2 w for a given angle w, or to
draw the angle w (whose sine square equals m) for a given segment m, we
proceed in the following manner:

We draw a semicircle ℌ with the diameter HK = 1. We draw the given angle w
at K on KH and from the intersection L of its free side with ℌ we drop the
perpendicular LM to HK. Then HM = m = sin^2 w.

Conversely, if m is given and we have to find w, we draw HM = m on HK,
erect at M a perpendicular on HK extending to the intersection L with ℌ, and
draw LK. Then ∠HKL = w.

Proof. From the right triangle HML it follows that

m = HM = HL · sin HLM = HL sin w,

and from the right triangle HKL

HL = HK sin w = sin w.

Consequently,

m = sin^2 w.
Monge’s Problem


To  draw  a  circle  that  cuts  three  given  circles  perpendicularly. 

The  French  mathematician  Monge  (1746-1818)  was  the  founder  of 
descriptive  geometry. 

In  order  to  solve  the  problem,  we  seek  the  locus  of  the  centers  of  all  the 
circles  that  are  perpendicular  to  two  given  circles. 

[Two  circles  are  said  to  intersect  perpendicularly  when  the  radii  r  and  r' 
drawn  to  a  single  point  of  intersection  are  perpendicular  to  each  other;  in  other 
words,  when  they  form  the  base  and  altitude  of  a  right  triangle  the  hypotenuse  z 
of which joins the centers of the circles, so that r^2 + r'^2 = z^2 or z^2 - r^2 = r'^2. Two
circles  are  therefore  perpendicular  to  each  other  when  the  power*  of  the  one  at 
the  midpoint  of  the  other  is  equal  to  the  square  of  the  radius  of  the  other.] 


FIG.  12. 

Let the given circles be 𝔎 and 𝔎', their centers K and K', their radii k and k'
(> k), the line joining their centers KK' = l. Let the circle 𝔛 with the midpoint X
and the radius x be perpendicular to them. Let the center lines KX and K'X be
equal to z and z', respectively. Then z^2 - k^2 and z'^2 - k'^2 are each equal to x^2, so
that

(1) z^2 - k^2 = z'^2 - k'^2.

Consequently, both circles 𝔎 and 𝔎' have the same power at X. We therefore first
attempt to find the locus of the point X at which the two given circles possess the
same power. If X is a point of this locus and the perpendicular from X
intercepts the center line KK' at the point F, and, moreover, if KF = f and K'F = f',
then, according to the Pythagorean theorem, the square of the perpendicular is
equal to z^2 - f^2 as well as to z'^2 - f'^2, so that

(2) z^2 - f^2 = z'^2 - f'^2.

If we subtract (2) from (1) we obtain

(3) f^2 - k^2 = f'^2 - k'^2,

i.e., 𝔎 and 𝔎' possess equal powers at F also. If we figure the distances f and f'
as positive in the directions KK' and K'K, respectively, then it is always true that

(4) f + f' = l.

Equations (3) and (4) give us fixed values for the unknowns f and f'.
Consequently every locus point X lies on the perpendicular erected on the center
line KK' at the fixed point F, and we obtain the

Theorem  of  the  chordal:  The  locus  of  the  point  at  which  two  given  circles 
possess  the  same  powers  is  a  straight  line  perpendicular  to  the  line  joining  the 
midpoints  of  the  circles  and  is  known  as  the  chordal  or  power  line  of  the  two 
circles. 
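
Equations (3) and (4) can be solved explicitly: from f^2 - f'^2 = k^2 - k'^2 and f + f' = l we get f - f' = (k^2 - k'^2)/l, hence f = (l^2 + k^2 - k'^2)/(2l). The sketch below checks this on sample data; the function names `chordal_offsets` and `power`, the coordinate placement of K at the origin and K' on the x-axis, and the numbers are assumptions made for illustration only.

```python
def chordal_offsets(k, k_prime, l):
    """Distances f = KF and f' = K'F of the foot F of the chordal,
    counted positive from K toward K' and from K' toward K, respectively."""
    f = (l * l + k * k - k_prime * k_prime) / (2.0 * l)
    return f, l - f


def power(center, radius, point):
    """Power of the circle (center, radius) at the given point."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy - radius * radius


# Hypothetical data: K at the origin, K' on the x-axis at distance l.
k, k_prime, l = 1.0, 3.0, 5.0
f, f_prime = chordal_offsets(k, k_prime, l)

# Every point of the perpendicular through F has equal powers in both circles.
power_differences = [
    power((0.0, 0.0), k, (f, y)) - power((l, 0.0), k_prime, (f, y))
    for y in (-2.0, 0.0, 0.5, 7.0)
]
```

All entries of `power_differences` are zero up to rounding, so the whole perpendicular through F belongs to the locus.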

In  the  construction  of  the  chordal  we  distinguish  two  different  cases: 

1. The circles intersect. Since both circles have equal powers at each of their
points of intersection, namely zero, the points of intersection lie on the chordal. The
chordal of two circles that intersect is the secant of intersection.

2.  The  circles  do  not  intersect.  Here  the  construction  of  the  chordal  is  based 
upon  the 

Theorem of Monge: The three chordals of three circles pass through a point
known  as  the  power  center  of  the  three  circles. 

[Proof.  Let  the  circles  be  I,  II,  III.  We  determine  the  point  of  intersection  O 
of  the  chordals  of  the  two  pairs  (II,  III)  and  (III,  I).  At  this  point  (1)  II  and  III,  (2) 
III  and  I  possess  equal  powers;  consequently  II  and  I  also  have  the  same  power 
at  O,  i.e.,  O  lies  on  the  chordal  of  I  and  II.] 

Thus,  to  construct  the  chordal  of  two  nonintersecting  circles  I  and  II,  we  draw 
an  auxiliary  circle  III  that  intersects  I  and  II  and  the  chordals  of  the  pairs  (II,  III) 
and  (III,  I).  The  perpendicular  from  the  intersection  of  these  chordals  to  the  line 
joining  the  centers  of  I  and  II  is  the  chordal  we  are  looking  for. 


From  the  theorem  of  the  chordal  it  then  follows: 

The  locus  of  the  centers  of  all  circles  that  are  perpendicular  to  two  given 
circles  is  the  chordal  of  the  given  circles  or,  in  the  event  that  these  circles 
intersect,  the  section  of  the  chordal  that  lies  outside  the  given  circles.  (The 
powers  of  the  given  circles  at  a  single  point  must  be  positive!) 

The  solution  of  Monge’s  problem  now  becomes  very  simple.  We  draw  the 
power  center  O  of  the  given  circles.  If  it  lies  outside  the  three  circles,  the  circle 
with  the  midpoint  O  and  the  radius  formed  by  the  tangent  from  O  to  one  of  the 
given  circles  intersects  perpendicularly  with  the  given  circles.  If  O  is  located 
inside  even  one  of  the  given  circles,  the  problem  is  insoluble. 
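
In coordinates the solution reduces to two linear equations, since equating the powers of two circles at a point gives the equation of their chordal. The following sketch is an analytic counterpart of the construction, with assumptions of our own: circles are given as hypothetical triples (x, y, radius), the function name `monge_circle` is illustrative, and the case of collinear centers (parallel chordals) is simply rejected.

```python
import math


def monge_circle(c1, c2, c3):
    """Circle (x, y, radius) cutting the three given circles perpendicularly,
    or None if the power center has no positive power (problem insoluble).
    Each given circle is a triple (x, y, radius)."""
    (x1, y1, r1), (x2, y2, r2), (x3, y3, r3) = c1, c2, c3
    # Chordal of circles 1 and 2, and of circles 1 and 3, as a*x + b*y = e.
    a1, b1 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    e1 = (x2 * x2 + y2 * y2 - r2 * r2) - (x1 * x1 + y1 * y1 - r1 * r1)
    a2, b2 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    e2 = (x3 * x3 + y3 * y3 - r3 * r3) - (x1 * x1 + y1 * y1 - r1 * r1)
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("centers are collinear: the chordals are parallel")
    ox = (e1 * b2 - e2 * b1) / det       # power center O by Cramer's rule
    oy = (a1 * e2 - a2 * e1) / det
    common_power = (ox - x1) ** 2 + (oy - y1) ** 2 - r1 * r1
    if common_power <= 0.0:
        return None                      # O is not outside the given circles
    return (ox, oy, math.sqrt(common_power))   # radius = tangent length from O


# Hypothetical sample circles.
circles = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.5), (1.0, 5.0, 2.0)]
solution = monge_circle(*circles)

# Three circles that overlap around a common region: no perpendicular circle.
no_solution = monge_circle((0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0))
```

For the first set of circles the returned circle satisfies d^2 = x^2 + r^2 with each given circle (d the distance of the centers, x and r the two radii), which is the perpendicularity condition stated at the beginning of this number.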
The Tangency Problem of Apollonius


To  draw  a  circle  that  is  tangent  to  three  given  circles. 

The  circles  may  also  comprise  degenerate  circles:  points  or  straight  lines. 

This celebrated problem was put forth by the greatest mathematician of the
ancient world after Euclid and Archimedes, Apollonius of Perga (ca. 260-170
B.C.), whose major work Κωνικά extended with an astonishing comprehensiveness
the period’s naturally slight knowledge of conic sections. His treatise De
Tactionibus, which contained the solution of the tangency problem given above,
has unfortunately been lost. François Viète, called Vieta, the greatest French
mathematician of the sixteenth century (1540-1603), attempted about 1600 to
restore the lost treatise of Apollonius and solved the tangency problem by
treating each of its ten special cases individually, deriving each successive one
from the preceding one. In contrast to this the solutions of Gauss (Complete
Works, vol. IV, p. 399), Gergonne (Annales de Mathématiques, vol. IV), and
Petersen (Methoden und Theorien) solve the general problem.

Here  we  will  restrict  ourselves  to  the  exposition  of  the  elegant  solution  oj 
Gergonne.  Since  this  proof  presupposes,  in  addition  to  the  chordal  theorems 
proved  in  No.  31,  a  knowledge  of  the  properties  of  similarity  points  and  polars, 
we  will  begin  with  a  brief  discussion  of  these. 


Similarity  Points 


When we refer to the external or positive and internal or negative similarity
points, respectively, of two circles $\mathfrak{K}$ and $\mathfrak{K}'$ with the
centers M and M' and the radii r and r', we mean the points A and J,
respectively, on the line MM' joining the centers for which

$$\frac{MA}{M'A} = +\frac{r}{r'} \quad\text{and}\quad \frac{MJ}{M'J} = -\frac{r}{r'}, \quad\text{respectively.*}$$

It  follows  directly  from  the  ray  theorem  that: 

The line connecting the end points of two similarly directed (oppositely
directed) parallel radii of two circles passes through the external (internal)
similarity point.

In particular, the external (internal) common tangents of the two circles pass
through the external (internal) similarity point. We will further designate the
external similarity point of the circles $\mathfrak{K}$ and $\mathfrak{K}'$ as
$+\mathfrak{K}\mathfrak{K}'$, the internal one as $-\mathfrak{K}\mathfrak{K}'$,
and, if the sign is not determined, we will indicate the similarity point as
$\varepsilon\mathfrak{K}\mathfrak{K}'$. The symbol
$\varepsilon\varepsilon'\varepsilon''\ldots$ is to be understood as meaning plus
when the number of minus signs occurring among the symbols $\varepsilon$,
$\varepsilon'$, $\varepsilon''$, ... is even and minus when it is odd.


FIG.  13. 


The  similarity  points  of  three  circles  are  described  by  the 

Theorem of D’Alembert:* If three circles $\mathfrak{A}$, $\mathfrak{B}$,
$\mathfrak{C}$ are taken in pairs ($\mathfrak{B}$, $\mathfrak{C}$),
($\mathfrak{C}$, $\mathfrak{A}$), and ($\mathfrak{A}$, $\mathfrak{B}$), the
external similarity points of the three pairs lie on a straight line; and,
similarly, the external similarity point of one pair and the two internal
similarity points of the other two pairs lie upon a straight line, a so-called
similarity axis of the three circles. More briefly:

If $\alpha\beta\gamma$ is plus, the three similarity points
$\alpha\mathfrak{B}\mathfrak{C}$, $\beta\mathfrak{C}\mathfrak{A}$, and
$\gamma\mathfrak{A}\mathfrak{B}$ lie on a straight line.

Monge’s proof. Let the centers of the circles $\mathfrak{A}$, $\mathfrak{B}$,
$\mathfrak{C}$ be A, B, C, and let the external similarity points of the pairs
($\mathfrak{B}$, $\mathfrak{C}$), ($\mathfrak{C}$, $\mathfrak{A}$),
($\mathfrak{A}$, $\mathfrak{B}$) be P, Q, R. If the circle pair ($\mathfrak{B}$,
$\mathfrak{C}$) with its external tangents that pass through P is rotated about
the axis PBC, we obtain the spheres $\mathfrak{B}_0$ and $\mathfrak{C}_0$ and
their tangent cone with apex P. The case is similar for the other two circle
pairs.

Let $E_1$ and $E_2$ be the two planes tangent to the spheres $\mathfrak{A}_0$,
$\mathfrak{B}_0$, $\mathfrak{C}_0$ in such a manner that the spheres always lie
on one side of the plane. Both planes contain the point P, since this point lies
on the external tangent of ($\mathfrak{B}_0$, $\mathfrak{C}_0$) within $E_1$
($E_2$). They likewise contain the points Q and R.

The three points P, Q, R thus lie on the line of intersection of the planes
$E_1$ and $E_2$.

If we are concerned with the internal similarity points of the pairs
($\mathfrak{B}$, $\mathfrak{C}$) and ($\mathfrak{A}$, $\mathfrak{C}$) and the
external similarity point of ($\mathfrak{A}$, $\mathfrak{B}$), we must take the
tangential planes so that $\mathfrak{A}_0$ and $\mathfrak{B}_0$ lie on one side
of such a plane while $\mathfrak{C}_0$ lies on the other.
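
D’Alembert's theorem lends itself to a quick numerical check. The sketch below assumes circles are tuples `(x, y, r)` with pairwise different radii (otherwise an external similarity point lies at infinity); `similarity_point` and `collinear` are hypothetical helper names.

```python
def similarity_point(c1, c2, sign):
    """External (sign=+1) or internal (sign=-1) similarity point of two circles (x, y, r)."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    # The point S on the center line whose directed distances satisfy MS / M'S = sign * r / r'.
    den = r2 - sign * r1
    return ((r2 * x1 - sign * r1 * x2) / den, (r2 * y1 - sign * r1 * y2) / den)

def collinear(p, q, r, tol=1e-9):
    """True when the three points lie on one straight line (zero cross product)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) < tol

def on_similarity_axis(A, B, C, alpha, beta, gamma):
    """Check the three similarity points alpha*BC, beta*CA, gamma*AB for collinearity."""
    return collinear(similarity_point(B, C, alpha),
                     similarity_point(C, A, beta),
                     similarity_point(A, B, gamma))
```

For any sign triple whose product is plus, `on_similarity_axis` should return `True`; this gives the four similarity axes of the three circles.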

Let an arbitrary circle $\mathfrak{X}$ with the center X be homogeneously
(nonhomogeneously) tangent to two fixed circles $\mathfrak{K}$ and
$\mathfrak{K}'$, with centers K and K' and radii k and k', at P and Q'. Let the
points of intersection of the straight line PQ' with the circles $\mathfrak{K}$
and $\mathfrak{K}'$ and with the line KK' joining their centers be P, Q; P', Q';
and S.

Since the base angles of the isosceles triangles KPQ, K'P'Q', and XPQ' are
also the vertical or coincident angles at P and Q', all six base angles are
equal. Since the two base angles at P and P' are equal, the radii KP and K'P'
are parallel. Consequently, S is the external (internal) similarity point of
$\mathfrak{K}$ and $\mathfrak{K}'$. From this it follows that

$$\frac{SP}{SP'} = \frac{k}{\pm k'}, \qquad \frac{SQ}{SQ'} = \frac{k}{\pm k'},$$

so that the two products SP · SQ' and SQ · SP' are equal. If we call their
common value w, then

$$w^2 = SP \cdot SQ' \cdot SQ \cdot SP' = (SP \cdot SQ)(SP' \cdot SQ'),$$

i.e., $w^2$ is equal to the product of the powers $\Pi$ and $\Pi'$ of the two
circles $\mathfrak{K}$ and $\mathfrak{K}'$ at S. Consequently,

$$SP \cdot SQ' = w = \sqrt{\Pi\Pi'}.$$

I.e.: The power (SP · SQ') of the circle $\mathfrak{X}$ at S is a constant
($\sqrt{\Pi\Pi'}$).


FIG.  15. 

The  result  of  our  considerations  is  the  following 

Tangency theorem: The external (internal) similarity point of two fixed
circles is the point at which all the circles homogeneously (nonhomogeneously)
tangent to the fixed circles have the same power and at which all the tangency
secants (which are determined by the points of tangency to the fixed circles)
intersect.

Pole  and  polar 

Two points P and P' that lie on a ray originating at the center O of a circle
$\mathfrak{K}$ with radius r in such manner that

$$OP \cdot OP' = r^2$$

are called conjugate with respect to each other in relation to the circle. Of two
conjugate points one lies inside the circle and the other outside.

The conjugate of an external point A is the point of intersection J of the circle
bisector from A with the tangency chord determined by the tangents $AT_1$ and
$AT_2$ from A to the circle.

The conjugate of an internal point J is the point of intersection A of the
tangents that pass through the end points $T_1$ and $T_2$ of the chord passing
through J and perpendicular to the circle bisector from J.

(From the right triangle $OAT_1$ it follows directly that $r^2 = OA \cdot OJ$.)

By the polar of the point P we mean the line p that is perpendicular to the
circle bisector from P and passes through the conjugate of P.

Conversely, by the pole of the line p we mean the point P that is conjugate to
the base point of the perpendicular dropped from the center of the circle to the
line.

The relation between the pole and the polar is therefore reciprocal: If p is the
polar of P, then P is the pole of p, and conversely.

FIG. 17.

Now let Q be any point on the polar p of P (that passes through the conjugate
P' of P) and let Q' be the conjugate of Q. Then

$$OP \cdot OP' = OQ \cdot OQ' \ (= r^2),$$

and consequently PP'QQ' is a quadrilateral inscribed in a circle. Since here the
angle at P' is 90°, the angle at Q' must also be 90°, i.e., PQ' must be
perpendicular to OQ. PQ' is therefore the polar q of Q, and we have the

Theorem of the pole and polar: If Q lies on the polar of P, P also lies on the
polar of Q. Or also: If p passes through the pole of q, q also passes through the
pole of p.
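
The reciprocity of pole and polar can be illustrated in coordinates. This is a small sketch under the assumption that a line is stored as a triple `(a, b, c)` meaning `a*x + b*y = c`; the helper names are not from the text.

```python
def polar(P, O, r):
    """Polar of point P with respect to the circle with center O and radius r.

    The polar is perpendicular to OP and passes through the conjugate P' of P,
    so it consists of the points X with (P - O) . (X - O) = r^2.
    """
    a, b = P[0] - O[0], P[1] - O[1]
    return (a, b, r * r + a * O[0] + b * O[1])

def on_line(line, X, tol=1e-9):
    """True when point X satisfies a*x + b*y = c."""
    a, b, c = line
    return abs(a * X[0] + b * X[1] - c) < tol
```

Taking any Q with `on_line(polar(P, O, r), Q)` true, the theorem states that `on_line(polar(Q, O, r), P)` is true as well.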

Now for Gergonne’s solution of the tangency problem.

In general, there are a number of circles that are tangent to three given circles
$\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$. Gergonne’s solution is based upon
the device of seeking the unknown circles in pairs rather than individually; in
particular, one always seeks that pair ($\mathfrak{X}$, $\mathfrak{x}$) that is
homogeneously or nonhomogeneously tangent to each of the given circles.

For the sake of convenience, we will call homogeneous tangencies positive
(+) and nonhomogeneous tangencies negative (-), and combinations such as
$\varepsilon\varepsilon'$ of the tangency signs $\varepsilon$ and $\varepsilon'$
will be treated in accordance with the rule that “like signs give plus and unlike
minus.”

Let the circles $\mathfrak{X}$ and $\mathfrak{x}$, respectively, be tangent to
the circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ at the points P, Q, R
and p, q, r, respectively, and let the tangencies possess the signs A, B, C and
a, b, c, respectively. Then

$$Aa = Bb = Cc = \varepsilon,$$

and

$$BC = bc = \alpha, \qquad CA = ca = \beta, \qquad AB = ab = \gamma,$$

and

$$\alpha\beta\gamma = +.$$

Let us first consider ($\mathfrak{X}$, $\mathfrak{x}$) as the pair tangent to the
circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$. According to the tangency
theorem, the similarity point $\varepsilon\mathfrak{X}\mathfrak{x}$ of
$\mathfrak{X}$ and $\mathfrak{x}$ is the power center O of the three circles
$\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ and the point of intersection of
the three tangency chords Pp, Qq, Rr.

We then take in succession ($\mathfrak{B}$, $\mathfrak{C}$), ($\mathfrak{C}$,
$\mathfrak{A}$), ($\mathfrak{A}$, $\mathfrak{B}$) as the pair tangent to the
circles $\mathfrak{X}$ and $\mathfrak{x}$. In accordance with the tangency
theorem, the circles $\mathfrak{X}$ and $\mathfrak{x}$ then have the same powers
at the similarity point $\alpha\mathfrak{B}\mathfrak{C}$ = I, as well as at the
similarity point $\beta\mathfrak{C}\mathfrak{A}$ = II, and the similarity point
$\gamma\mathfrak{A}\mathfrak{B}$ = III. And since $\alpha\beta\gamma$ is +, the
three points I, II, III, in accordance with d’Alembert’s theorem, lie upon a
similarity axis of $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$. The similarity
axis I II III is thus the chordal $\chi$ of the circles $\mathfrak{X}$ and
$\mathfrak{x}$.

Further, if S represents the point of intersection of the tangents to
$\mathfrak{A}$ at P and p, then SP = Sp. Since these tangents also touch
$\mathfrak{X}$ and $\mathfrak{x}$, S lies on the chordal $\chi$ of $\mathfrak{X}$
and $\mathfrak{x}$. Now S is also the pole of the tangency chord Pp with respect
to circle $\mathfrak{A}$. Since $\chi$ therefore passes through the pole of Pp,
it follows from the theorem of the pole and polar that Pp passes through the pole
of $\chi$. Since the same conclusions can be drawn with respect to the tangency
chords Qq and Rr, we obtain the theorem: The tangency chords Pp, Qq, and Rr pass
respectively through the poles of the line $\chi$ = I II III with respect to the
circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$.

From the three theorems italicized in the last three paragraphs we obtain
directly

Gergonne’s construction: Draw the power center O of the given circles
and the similarity axis I II III = $\chi$. Determine the poles 1, 2, 3 of $\chi$
in relation to the given circles and connect them with O. The connecting lines
meet the given circles at the points at which they are tangent to the sought-for
circles.
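
Gergonne's construction can be traced in coordinates. The following is only a sketch under stated assumptions: circles are tuples `(x, y, r)` with pairwise different radii, only the external similarity axis (all signs plus) is used, and the sorting of the six tangency points into two triples is done by trying all eight combinations, a shortcut that is not part of the construction itself. All function names are hypothetical.

```python
import math
from itertools import product

def power_center(c1, c2, c3):
    (x1, y1, r1), (x2, y2, r2), (x3, y3, r3) = c1, c2, c3
    k1 = x1 * x1 + y1 * y1 - r1 * r1
    a1, b1, e1 = 2 * (x2 - x1), 2 * (y2 - y1), x2 * x2 + y2 * y2 - r2 * r2 - k1
    a2, b2, e2 = 2 * (x3 - x1), 2 * (y3 - y1), x3 * x3 + y3 * y3 - r3 * r3 - k1
    det = a1 * b2 - a2 * b1
    return ((e1 * b2 - e2 * b1) / det, (a1 * e2 - a2 * e1) / det)

def external_similarity_point(c1, c2):
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    return ((r2 * x1 - r1 * x2) / (r2 - r1), (r2 * y1 - r1 * y2) / (r2 - r1))

def pole(U, V, circle):
    """Pole of the line UV: the conjugate of the foot of the perpendicular from the center."""
    x, y, r = circle
    dx, dy = V[0] - U[0], V[1] - U[1]
    t = ((x - U[0]) * dx + (y - U[1]) * dy) / (dx * dx + dy * dy)
    fx, fy = U[0] + t * dx - x, U[1] + t * dy - y
    d2 = fx * fx + fy * fy
    return (x + r * r * fx / d2, y + r * r * fy / d2)

def line_circle(O, T, circle):
    """Both points where the line OT meets the circle."""
    x, y, r = circle
    dx, dy = T[0] - O[0], T[1] - O[1]
    a = dx * dx + dy * dy
    b = 2 * ((O[0] - x) * dx + (O[1] - y) * dy)
    c = (O[0] - x) ** 2 + (O[1] - y) ** 2 - r * r
    root = math.sqrt(b * b - 4 * a * c)
    return [(O[0] + t * dx, O[1] + t * dy) for t in ((-b - root) / (2 * a), (-b + root) / (2 * a))]

def circumcircle(P, Q, R):
    (ax, ay), (bx, by), (cx, cy) = P, Q, R
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy, math.hypot(ax - ux, ay - uy))

def is_tangent(c, s, tol=1e-7):
    d = math.hypot(c[0] - s[0], c[1] - s[1])
    return abs(d - (c[2] + s[2])) < tol or abs(d - abs(c[2] - s[2])) < tol

def gergonne_pair(circles):
    """The pair of circles tangent in like manner to all three given circles."""
    a, b, c = circles
    O = power_center(a, b, c)
    I, II = external_similarity_point(b, c), external_similarity_point(c, a)
    hits = [line_circle(O, pole(I, II, k), k) for k in circles]
    candidates = (circumcircle(*pts) for pts in product(*hits))
    return [s for s in candidates if all(is_tangent(k, s) for k in circles)]
```

For three circles lying outside one another, `gergonne_pair` is expected to return two circles: one touching all three externally and one enclosing all three. The other three similarity axes would yield the remaining three pairs.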
Mascheroni’s Compass Problem


To  prove  that  any  construction  that  can  be  carried  out  with  a  compass  and 
straight-edge  can  be  carried  out  with  the  compass  alone. 

The  Italian  L.  Mascheroni  (1750-1800)  posed  himself  the  problem  of 
executing  the  geometric  constructions  with  a  compass  alone  (without  the  use  of 
the  straight-edge)  and  solved  it  in  a  masterly  fashion  in  his  book  La  geometria 
del  compasso,  which  was  published  in  Pavia  in  1797. 

If we examine the separate steps by which the compass and straight-edge
constructions are carried out, we see that every step consists of one of the
following three basic constructions:


I.  Finding  the  point  of  intersection  of  two  straight  lines; 

II.  finding  the  point  of  intersection  of  a  straight  line  and  a  circle; 

III.  finding  the  point  of  intersection  of  two  circles. 

Consequently,  we  need  only  show  that  the  two  basic  constructions  I.  and  II. 
can  be  accomplished  with  a  compass  alone.  (In  Mascheroni’s  geometry  of  the 
compass  a  straight  line  is,  naturally,  regarded  as  given  or  determined  if  two  of  its 
points  are  known.) 

First  we  must  solve  two  preliminary  problems. 

Preliminary  problem  1.  To  draw  the  sum  or  difference  of  two  given 
segments  a  and  b. 

In  other  words:  to  lengthen  or  shorten  a  given  segment  PQ  =  a  by  a  segment 
QX=  b. 

Solution. 1. We draw the arc Q|b,* take upon this arc any point H, draw the
mirror image H' of H on the straight line g determined by the points P and Q
(the mirror image O' of a point O on a straight line AB is the point of
intersection of the arcs A|AO and B|BO), and designate the segment HH' as h. 2.
We draw the isosceles trapezoid KHH'K' whose legs KH and K'H' are equal to b and
whose base KK' = 2h. (K is the point of intersection of the arcs Q|h and H|b, K'
is the mirror image of K on g.) Let the diagonal KH' = HK' of the trapezoid be
called d. Since the trapezoid is a quadrilateral that can be inscribed in a
circle, according to Ptolemy the following equation is applicable:

$$d^2 = b^2 + 2h^2.$$

On the other hand, it follows from the right triangle QK'X, where K'X will be
designated as x, that

$$x^2 = b^2 + h^2.$$

From these two equations it follows that

$$d^2 = x^2 + h^2,$$

so that x is one of the legs of a right triangle with the hypotenuse d and the
other leg h. If we then find the point of intersection S of the arcs K|d and
K'|d on the straight line g, QS = x. 3. We draw the point of intersection of the
arcs K|x and K'|x; this is the point X that we have been trying to find.
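
The three steps above can be simulated with circle intersections only. The sketch below is an illustration under assumptions: points are coordinate pairs, `angle` merely picks some point H of the arc Q|b that does not lie on the line PQ, and of the two intersections of Q|h and H|b the code keeps the one for which KK' = 2h. The function names are not Mascheroni's.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Both intersection points of the arcs c0|r0 and c1|r1."""
    d = math.dist(c0, c1)
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    mx, my = c0[0] + a * (c1[0] - c0[0]) / d, c0[1] + a * (c1[1] - c0[1]) / d
    ox, oy = h * (c1[1] - c0[1]) / d, h * (c1[0] - c0[0]) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def mirror(pt, A, B):
    """Mirror image of pt in line AB: the other intersection of arcs A|A-pt and B|B-pt."""
    cands = circle_intersections(A, math.dist(A, pt), B, math.dist(B, pt))
    return max(cands, key=lambda c: math.dist(c, pt))

def lengthen_and_shorten(P, Q, b, angle=1.0):
    """Return both points X on line PQ with QX = b, using compass steps only."""
    H = (Q[0] + b * math.cos(angle), Q[1] + b * math.sin(angle))  # a point of arc Q|b
    H1 = mirror(H, P, Q)
    h = math.dist(H, H1)
    # Keep the intersection of Q|h and H|b that is farthest from line PQ (KK' = 2h).
    K = max(circle_intersections(Q, h, H, b), key=lambda c: math.dist(c, mirror(c, P, Q)))
    K1 = mirror(K, P, Q)
    d = math.dist(K, H1)                      # diagonal of the trapezoid KHH'K'
    S = circle_intersections(K, d, K1, d)[0]  # lies on line PQ, with QS = x
    x = math.dist(Q, S)
    return circle_intersections(K, x, K1, x)
```

One of the two returned points lengthens PQ by b and the other shortens it by b.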


Preliminary  problem  2.  To  find  the  fourth  segment  x  that  is  in  proportion 
to  the  three  given  segments  m,  n,  s. 

In other words, draw the segment

$$x = \frac{n}{m}\, s.$$

The following solution that Mascheroni found for this fundamental problem is
remarkable for its shortness and simplicity.

We draw two concentric circles $\mathfrak{M}$ = Z|m and $\mathfrak{N}$ = Z|n,
draw the chord AB = s in $\mathfrak{M}$, lay off with the compass any length w
from A and from B on $\mathfrak{N}$, obtaining from the distance between the
resulting points of intersection H and K the sought-for segment x. The proof
follows directly from the similar triangles ZAB and ZHK.

In this construction it is assumed that s falls within circle $\mathfrak{M}$. If
this is not the case, we first transform the fraction n/m into N/M, where N and
M, respectively, are sufficiently great integral multiples of n and m which can
be drawn according to the first preliminary problem. (A comparatively simple
method is the doubling that results, for example, when PQ = m, and the radius m
of the circle P|PQ is laid off three times in succession from Q. The end point
after this laying off is separated from Q by the distance 2m.)
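
Mascheroni's fourth-proportional construction is easy to verify numerically. This is a minimal sketch assuming s is at most 2m; the default choice of the auxiliary length w and the function names are assumptions made for the example.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Both intersection points of the arcs c0|r0 and c1|r1."""
    d = math.dist(c0, c1)
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    mx, my = c0[0] + a * (c1[0] - c0[0]) / d, c0[1] + a * (c1[1] - c0[1]) / d
    ox, oy = h * (c1[1] - c0[1]) / d, h * (c1[0] - c0[0]) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def fourth_proportional(m, n, s, w=None):
    """Length HK = (n/m) * s obtained from two concentric circles Z|m and Z|n."""
    Z, A = (0.0, 0.0), (m, 0.0)
    B = circle_intersections(Z, m, A, s)[0]       # chord AB = s in the circle Z|m
    if w is None:
        w = abs(n - m) + 0.5 * min(m, n)          # any length that reaches Z|n from Z|m
    # Taking the first intersection both times turns A and B about Z in the same sense.
    H = circle_intersections(A, w, Z, n)[0]
    K = circle_intersections(B, w, Z, n)[0]
    return math.dist(H, K)
```

The triangles ZAH and ZBK are congruent, so the angle HZK equals the angle AZB, which is why the result does not depend on w.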

After  the  solution  of  the  preliminary  problems,  we  go  on  to  the  solution  of  the 
two  major  problems. 

I'. To find the point of intersection S of two straight lines AB and CD (each of
which is given by two points) with the compass alone.

II'. To determine the point of intersection S of a given circle $\mathfrak{K}$
and a given straight line AB with the compass alone.

Solution of I'. We draw the mirror images C' and D' of C and D with respect
to AB. The sought-for point of intersection S then also lies on C'D'. According
to the ray theorem, it follows that CS/SD = CC'/DD', i.e., if we designate the
segments CS, CD, CC', DD' as x, e, c, d, respectively, x/(e - x) = c/d or

$$x = \frac{c}{c + d}\, e.$$

Now we begin by drawing CH = c + d (H as the point of intersection of the
arcs C'|d and D|e); then we draw the segment x in accordance with preliminary
problem 2; and finally we draw the sought-for point of intersection S as the
intersection of the arcs C|x and C'|x.


Solution of II'. Let the center of the given circle be known as M, the radius
as r. We draw the mirror image M' of M with respect to the straight line AB and
with the compass open to the radius r we strike off r on the circle
$\mathfrak{K}$ from M'. The resulting points of intersection are the sought-for
points of intersection of the given straight line AB with the given circle
$\mathfrak{K}$.

The construction cannot be carried out if the straight line AB happens to pass
through M. In this exceptional case we extend and shorten the segment AM by r
in accordance with preliminary problem 1. The end points of the extended and
shortened segment are the sought-for points of intersection of $\mathfrak{K}$
and AB.
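
Both solutions can be sketched with circle intersections alone. The code below is an illustration under assumptions: in `intersect_lines` the points C and D are taken to lie on opposite sides of AB (the case x = ce/(c + d) treated above), and the fourth proportional is computed arithmetically as a stand-in for preliminary problem 2. The names are hypothetical.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Both intersection points of the arcs c0|r0 and c1|r1."""
    d = math.dist(c0, c1)
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    mx, my = c0[0] + a * (c1[0] - c0[0]) / d, c0[1] + a * (c1[1] - c0[1]) / d
    ox, oy = h * (c1[1] - c0[1]) / d, h * (c1[0] - c0[0]) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def mirror(pt, A, B):
    """Mirror image of pt in line AB: the other intersection of arcs A|A-pt and B|B-pt."""
    cands = circle_intersections(A, math.dist(A, pt), B, math.dist(B, pt))
    return max(cands, key=lambda c: math.dist(c, pt))

def intersect_lines(A, B, C, D):
    """Problem I': intersection S of lines AB and CD (C, D on opposite sides of AB)."""
    C1, D1 = mirror(C, A, B), mirror(D, A, B)
    c, d, e = math.dist(C, C1), math.dist(D, D1), math.dist(C, D)
    x = c * e / (c + d)  # fourth proportional to c + d, c, e
    # The arcs C|x and C'|x meet twice on AB; S is the meeting point nearer to D.
    return min(circle_intersections(C, x, C1, x), key=lambda s: math.dist(s, D))

def intersect_line_circle(A, B, M, r):
    """Problem II': intersections of line AB with the circle M|r."""
    M1 = mirror(M, A, B)
    if math.dist(M, M1) < 1e-12:
        raise ValueError("AB passes through M; use preliminary problem 1 instead")
    return circle_intersections(M, r, M1, r)
```

The exceptional case of II' is only signalled here by an exception, since it falls back on the lengthening and shortening of AM described in the text.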

This completes the solution of Mascheroni’s problem.

Steiner’s Straight-edge Problem


To  prove  that  every  construction  that  can  be  executed  with  compass  and 
straight-edge  can  be  executed  with  a  straight-edge  alone  in  the  event  that  within 
the  picture  plane  there  is  also  given  a  fixed  circle. 

As far back as 1759 Lambert had solved a whole series of geometric
constructions with straight-edge alone in his book Freie Perspective, which was
published in Zurich that year. He is also the source of the term “straight-edge
geometry.” After Lambert the French mathematicians, primarily Poncelet and
Brianchon, took up straight-edge geometry, particularly after the publication of
Mascheroni’s Geometria del compasso provided a new stimulus to these studies,
and they attempted to execute as many constructions as possible with the
straight-edge alone.

Now, with the use of a straight-edge alone it is possible to represent only
those algebraic expressions whose algebraic form is rational (thus, for example,
it is impossible to represent expressions such as $\sqrt{ab}$). This circumstance
suggested to Poncelet that an additional fixed circle (as well as the center!) must
be given inside the picture plane for it to be possible to draw with straight-edge
alone all the algebraic expressions that can be constructed with a compass and
straight-edge.

This suggestion was confirmed as a certainty by Jakob Steiner (1796-1863),
the greatest geometer since the days of Apollonius, in his celebrated book Die
geometrischen Konstruktionen ausgeführt mittels der geraden Linie und eines
festen Kreises (Geometrical Constructions Executed with a Straight Line and
One Fixed Circle), published in Berlin, 1833.

The  solution  presented  here  is  based  upon  that  in  Steiner’s  book,  except  that 
we  have  here  eliminated  everything  that  is  not  strictly  essential  for  the  purpose  at 
hand,  and  we  have  also  made  it  somewhat  more  elementary  by  dispensing  with 
the  theorems  of  homothety  and  chordals  employed  by  Steiner. 

Since  in  straight-edge  geometry  the  intersection  of  two  straight  lines  is 
known  directly,  we  need  only  demonstrate  that  the  two  fundamental  problems  II. 
and  III.  of  the  previous  section  can  be  solved  by  means  of  a  straight-edge  and  a 
fixed  circle  alone. 

As in the solution of Mascheroni’s problem, we must first solve several
preliminary problems; in this case there are five rather than two.

Preliminary problem 1: To draw through a given point the parallel to a
given line.

Steiner distinguishes two cases: 1a. construction of the parallel to a directed
straight line; 1b. construction of the parallel to an arbitrary straight line.

1a. A directed straight line is understood to mean a straight line in which two
points A and B and the midpoint M of the segment joining them are known. In
order to draw the parallel to such a line through a given point P, we draw AP,
choose a point S on the extension of AP, connect this point with B and M, draw
BP, and draw the straight line AO through the point of intersection O of BP and
MS in such a manner that AO cuts BS at Q. PQ is then the desired parallel. The
proof is simple.
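
The construction in 1a uses nothing but joining points and intersecting lines, so it can be traced directly in coordinates. This is a sketch with assumed helper names; the parameter `t` only selects some point S on the extension of AP beyond P.

```python
def line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y = c."""
    a, b = q[1] - p[1], p[0] - q[0]
    return (a, b, a * p[0] + b * p[1])

def meet(l1, l2):
    """Point of intersection of two non-parallel lines."""
    det = l1[0] * l2[1] - l2[0] * l1[1]
    return ((l1[2] * l2[1] - l2[2] * l1[1]) / det, (l1[0] * l2[2] - l2[0] * l1[2]) / det)

def parallel_through(A, B, M, P, t=2.0):
    """Second point Q of the parallel to AB through P, with M the midpoint of AB."""
    S = (A[0] + t * (P[0] - A[0]), A[1] + t * (P[1] - A[1]))  # on the extension of AP
    O = meet(line(B, P), line(M, S))
    return meet(line(A, O), line(B, S))
```

For A = (0, 0), B = (4, 0), M = (2, 0), and P = (1, 3), the point returned is Q = (3, 3), so PQ is indeed parallel to AB.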


FIG. 24.

1b. We connect a given point M of the given straight line g with the midpoint
F of the given fixed circle $\mathfrak{F}$ and designate the points of
intersection of the connecting line and $\mathfrak{F}$ as U and V. The points U,
F, V make the line FM a directed line. In accordance with 1a., we draw a parallel
to FM in such a manner that it cuts $\mathfrak{F}$ at X and Y and g at A. If we
then draw the diameters XFX' and YFY' and connect the end points X' and Y', the
connecting line intersects the given line at a point B in such a manner that
MA = MB, and g, defined by the three points A, M, B, is then a directed line.
This makes it possible to determine the parallel to g in accordance with 1a.


Preliminary  problem  1  gives  us  the  solution  to  the  problem:  shift  a  given 
segment  AB  parallel  to  itself  in  such  a  manner  that  one  of  its  end  points  lies  on  a 
given  point  P. 

If  P  falls  outside  the  straight  line  AB  we  find  the  point  of  intersection  Q  of  the 
parallel  through  B  to  AP  and  the  parallel  through  P  to  AB;  PQ  is  then  parallel  to 
AB. 

Preliminary problem 2: Draw a perpendicular through a given point P to a
given straight line g.

We draw g' parallel to g in such a manner that it cuts $\mathfrak{F}$ at U and
V. We then draw the diameter UFU' and the chord VU' which, according to Thales’
theorem, is perpendicular to g' and consequently also perpendicular to g.
Finally, we draw the parallel to VU' through P in accordance with 1; this
parallel is the desired perpendicular.


Preliminary  problem  3:  To  lay  off  a  given  distance  PQ  from  a  given  point  O 
in  a  given  direction. 

Let  us  consider  the  prescribed  direction  as  given  by  the  segment  OH  from  O. 
First,  in  accordance  with  1.,  we  displace  PQ  parallel  to  itself  to  OK.  Then  from  F 
we  draw  two  radii  FU  and  FV  in  the  directions  OH  and  OK.  Finally,  if  we  draw 
through  K  the  parallel  to  UV,  the  point  of  intersection  S  of  the  parallel  with  the 
line  OH  gives  the  end  point  of  the  desired  segment. 

Preliminary  problem  4:  If  three  distances  m,  n,  s  are  given,  draw  the  fourth 
proportional. 

From any point O we draw two rays I and II, mark off the two distances OM
= m and ON = n on I and the distance OS = s on II; we draw the parallel to MS
through N and designate its point of intersection with II as X. Then

$$OX = \frac{n}{m}\, s$$

is the desired fourth proportional.

Preliminary problem 5: If two segments a and b are given, draw the mean
proportional.

We designate the sought-for mean proportional $\sqrt{ab}$ as x, the diameter of
the fixed circle as d, the sum a + b that can be constructed according to 3. as
c, and we write

$$x = \frac{c}{d}\, s, \quad\text{with}\quad s = \sqrt{hk}, \quad h = \frac{d}{c}\, a, \quad k = \frac{d}{c}\, b$$

(so that h + k = d).

First, in accordance with 4., we draw the segments h and k, and in accordance
with 3., we make HO = h on a diameter HK of the fixed circle, so that KO will
necessarily equal k. Then, according to 2., we construct through O the
perpendicular to HK and call the intersection of the perpendicular with the fixed
circle S. Then OS = $\sqrt{hk}$ = s. Finally, we draw the desired segment
x (= (c/d)s) according to 4.
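
The arithmetic behind this rescaling can be traced step by step. In the sketch below the half-chord OS is computed from the fixed circle itself rather than from the formula $\sqrt{hk}$, so that the geometric claim is what gets checked; the function name is an assumption.

```python
import math

def mean_proportional(a, b, d):
    """Mean proportional of a and b obtained from a fixed circle of diameter d."""
    c = a + b
    h, k = d / c * a, d / c * b      # rescaled so that h + k = d
    # O lies on the diameter HK at distance h from H; S is on the circle above O.
    s = math.sqrt((d / 2) ** 2 - (h - d / 2) ** 2)
    assert abs(s - math.sqrt(h * k)) < 1e-9 * d  # the half-chord equals sqrt(hk)
    return c / d * s                 # scale back: x = (c/d) * s
```

With a = 2, b = 8, and d = 5 one gets c = 10, h = 1, k = 4, s = 2, and finally x = 4.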

Now  that  we  have  solved  these  five  preliminary  problems,  the  solution  of  the 
two  basic  problems  II  and  III  is  simple. 

Basic  problem  II:  To  draw  the  points  of  intersection  of  a  given  line  and  a 
given  circle. 

In straight-edge geometry a circle is considered determined if its center and
radius are known. Let us designate the given circle as $\mathfrak{K}$, its center
as C, its radius as r, the given straight line as g, the points of intersection
of g with circle $\mathfrak{K}$ as X and Y, the chord of intersection as 2s, the
midpoint of the chord as M, its distance from the center C as l. From the right
triangle CMX we obtain the equation

$$s^2 = r^2 - l^2 \quad\text{or}\quad s = \sqrt{(r + l)(r - l)}.$$

Then, in accordance with 2., we drop the perpendicular CM = l to g; we draw
the segments a = r + l and b = r - l in accordance with 3.; then, according to
5., we draw the segment $s = \sqrt{ab}$; and finally, according to 3., we lay off
s from M on g in both directions. The end points of the laid-off segments are the
desired points of intersection X and Y.


FIG.  26. 


Basic  problem  III:  Find  the  points  of  intersection  of  two  given  circles. 

Let us designate the circles as $\mathfrak{A}$ and $\mathfrak{B}$, their
midpoints as A and B, their radii as a and b, the line AB joining their centers
as c, the sought-for points of intersection as X and Y, the point of intersection
of the chord XY with the center line AB as O, and, finally, the unknown segments
AO and OX as q and x.

Finding q. From the triangle ABX it may be inferred, in accordance with the
expanded Pythagorean theorem, $b^2 = c^2 + a^2 - 2cq$; thus, if we set
$c^2 + a^2$ equal to $d^2$,

$$q = \frac{(d + b)(d - b)}{2c}.$$

Consequently, we draw, in accordance with 2. and 3., a right triangle with the
short legs a and c and obtain as the hypotenuse d.

Then, according to 3., we draw the segments

$$n = d + b, \qquad m = 2c, \qquad s = d - b$$

and finally, according to 4.,

$$q = \frac{n}{m}\, s.$$

Finding x. From $\triangle OAX$ it follows, according to the Pythagorean
theorem, that $x^2 = a^2 - q^2$; thus

$$x = \sqrt{(a + q)(a - q)}.$$

According to 3., we draw h = a + q, k = a - q and, according to 5.,

$$x = \sqrt{hk}.$$

Construction of X and Y. According to 3., we lay off q from A on AB. At O,
the end of the segment laid off, we erect the perpendicular to AB in accordance
with 2. and (according to 3.) we lay off x on it in both directions. The end points
of the laid-off segments are the points of intersection that we are looking for.
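
The formulas for the two basic problems translate directly into coordinates. This is a sketch, assuming the line and circles actually intersect; the segment operations of the preliminary problems are replaced by ordinary arithmetic, and the function names are illustrative.

```python
import math

def line_circle_points(G1, G2, C, r):
    """Basic problem II: intersections of the line G1G2 with the circle (C, r)."""
    g = math.dist(G1, G2)
    ux, uy = (G2[0] - G1[0]) / g, (G2[1] - G1[1]) / g
    t = (C[0] - G1[0]) * ux + (C[1] - G1[1]) * uy
    M = (G1[0] + t * ux, G1[1] + t * uy)       # foot of the perpendicular from C
    l = math.dist(C, M)
    s = math.sqrt((r + l) * (r - l))           # mean proportional of r + l and r - l
    return [(M[0] - s * ux, M[1] - s * uy), (M[0] + s * ux, M[1] + s * uy)]

def circle_circle_points(A, a, B, b):
    """Basic problem III: intersections of the circles (A, a) and (B, b)."""
    c = math.dist(A, B)
    d = math.hypot(a, c)                       # hypotenuse of the right triangle with legs a, c
    q = (d + b) * (d - b) / (2 * c)            # q = AO, as a fourth proportional
    x = math.sqrt((a + q) * (a - q))           # x = OX, as a mean proportional
    ux, uy = (B[0] - A[0]) / c, (B[1] - A[1]) / c
    O = (A[0] + q * ux, A[1] + q * uy)
    return [(O[0] - x * uy, O[1] + x * ux), (O[0] + x * uy, O[1] - x * ux)]
```

For the circles with centers (0, 0) and (6, 0) and radii 5 and 5, the formulas give d² = 61, q = 3, and x = 4, hence the points (3, 4) and (3, -4).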
The Delian Cube-doubling Problem


To  construct  the  edge  of  a  cube  that  is  double  the  size  of  a  given  cube. 


The  name  “Delian  problem,”  according  to  an  account  given  by  the 
mathematician  and  historian  Eutocius  (sixth  century  a.d.),  goes  back  to  an  old 
legend  according  to  which  the  Delphic  oracle  in  one  of  its  utterances  demanded 
that  the  Delian  altar  block  be  doubled. 

If k is the edge of the given cube and x the edge of the cube we are seeking,
the respective volumes of the two cubes are $k^3$ and $x^3$. Consequently we are
confronted with the problem of finding, when the segment k is given, a second
segment x such that

$$x^3 = 2k^3.$$

This problem is not capable of solution with compass and straight-edge. (See the
Supplement to No. 36.)

The numerous solutions to this problem, some of which were found in
antiquity, consequently make use of more advanced means.

Thus, the solution of the Greek mathematician Menaechmus (ca. 375-325
b.c.) is based upon finding the point of intersection of the two parabolas

(1) $x^2 = ky$ and (2) $y^2 = 2kx$

with the parameters k and 2k. The abscissa x of the point of intersection
satisfies the condition $x^3 = 2k^3$ as a result of the fact that
$x^4 = k^2y^2 = 2k^3x$, and the sought-for edge x is thereby obtained.

Descartes (1596-1650) showed that one of the two parabolas (1) and (2) was
sufficient. For their point of intersection x|y the following equation is also
true:

$$x^2 + y^2 = ky + 2kx;$$

and this is the equation of a circle with the midpoint coordinates k and k/2 which
passes through the common apex of the two parabolas. Thus, it is only necessary
to find the intersection of this circle with one of the two parabolas to find the
sought-for point of intersection.
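
Descartes' remark can be checked numerically: intersecting the circle with parabola (1) alone already yields the abscissa with x³ = 2k³. The following is a small sketch using bisection; the function name and the choice of bracket [k, 2k] are assumptions made for the example.

```python
def menaechmus_abscissa(k):
    """Abscissa x > 0 where the parabola x^2 = k*y meets the circle x^2 + y^2 = k*y + 2*k*x."""
    def f(x):
        y = x * x / k                      # point on parabola (1)
        return x * x + y * y - k * y - 2 * k * x
    lo, hi = k, 2 * k                      # f(k) = -k^2 < 0 and f(2k) = 12 k^2 > 0
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The value returned should agree with `k * 2 ** (1 / 3)` to within rounding error, and the corresponding ordinate y = x²/k then satisfies parabola (2) as well.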


FIG. 27.

The simplest and most accurate method of constructing

$$x = k\sqrt[3]{2}$$

is by paper strip construction. 1. We draw an equilateral triangle ABC with the
side k, extend CA by AD = k, and draw the line DB. 2. We mark off on the sharp
edge of a paper strip the distance k. 3. We place the paper strip in such a way
that the edge passes through C and the end points of the marked-off distance
fall upon two points P and Q of the extensions of AB and DB.

Then

$$CQ = x = k\sqrt[3]{2}.$$

Proof. Let CQ = x, BP = y. According to the leg transversal theorem applied to
figure CABP, $(x + k)^2 - k^2 = y(k + y)$ or

(I) $x^2 + 2kx = y^2 + ky$.

According to Menelaus' theorem applied to the triangle ACP with the
transversal DBQ, AD · CQ · BP = PQ · AB · CD or

(II) $xy = 2k^2$.

A glance at equations (I) and (II) shows that they are satisfied by the roots x
and y of equations (1) and (2). The unknowns x and y, which are determined by
(I) and (II), are therefore at the same time the coordinates of the point of
intersection of Menaechmus’ parabolas. In particular, $x = k\sqrt[3]{2}$.
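
The paper strip placement can be imitated by sliding P along the extension of AB until the marked-off distance PQ equals k. This is a sketch with an assumed coordinate system (A at the origin, B on the positive x-axis) and a simple bisection; none of the names come from the text.

```python
import math

def double_cube_by_strip(k=1.0):
    """Return (CQ, BP) for the strip position with PQ = k."""
    B = (k, 0.0)
    C = (k / 2, k * math.sqrt(3) / 2)
    D = (-k / 2, -k * math.sqrt(3) / 2)        # CA extended beyond A by AD = k

    def strip(y):
        """For P at distance y beyond B on AB, return (CQ, PQ) with Q = CP meets DB."""
        px, py = k + y - C[0], -C[1]           # direction C -> P
        bx, by = B[0] - D[0], B[1] - D[1]      # direction D -> B
        t = ((D[0] - C[0]) * by - (D[1] - C[1]) * bx) / (px * by - py * bx)
        cp = math.hypot(px, py)
        return t * cp, (1 - t) * cp

    lo, hi = 0.0, 10 * k                       # PQ - k is negative at lo, positive at hi
    for _ in range(200):
        mid = (lo + hi) / 2
        if strip(mid)[1] < k:
            lo = mid
        else:
            hi = mid
    y = (lo + hi) / 2
    return strip(y)[0], y
```

For k = 1 the two returned values should be close to $\sqrt[3]{2}$ and $\sqrt[3]{4}$, in agreement with equations (I) and (II).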

Naturally,  this  result  can  also  be  obtained  without  reference  to  these 
parabolas. 

Note.  The  doubled  cube  can  also  be  constructed  by  means  of  the  so-called 
conchoid  of  Nicomedes,  a  Greek  mathematician  who  lived  at  the  beginning  of 
the second century b.c.; we cannot, however, present this construction here.

Trisection of an Angle


To  divide  an  angle  into  three  equal  angles. 

This  famous  problem  cannot  be  solved  with  compass  and  straightedge  (see 
the  supplement). 

The  simplest  solution  is  by  means  of  the  following  paper  strip  construction 
of  Archimedes 


Taking  as  the  center  the  apex  S  of  the  angle  <X>  to  be  trisected,  we  draw  a 
circle  of  radius  r  that  intersects  the  legs  of  the  angle  at  A  and  B.  We  mark  off  a 
segment  of  length  r  on  the  edge  of  a  paper  strip.  We  place  the  edge  on  the  figure 
in  such  a  way  that  it  passes  through  B  and  that  one  end  point  of  the  marked-off 
segment  coincides  with  a  point  P  on  the  circle,  while  the  other  end  point 
coincides  with  a  point  Q  (outside  the  circle)  of  the  extension  of  AS.  Then  &PQS 
=  (pis  one  third  of  the  given  angle  O. 


Proof. Since PS = PQ (= r), triangle PQS is isosceles and $\angle PSQ$ is therefore also equal to $\varphi$, while the external angle $\angle SPB$ is equal to $2\varphi$. Since triangle SPB is also isosceles, $\angle SBP = \angle SPB = 2\varphi$. Finally, since the external angle $\Phi$ at S of the triangle SBQ is equal to the sum of the two nonadjacent internal angles SQB and SBQ, we find that $\Phi = \varphi + 2\varphi$ or

$\varphi = \tfrac{1}{3}\Phi$.  Q.E.D.
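The paper strip figure can be reproduced in coordinates. The following is a minimal sketch, assuming S at the origin and the leg SA along the positive x-axis; it places P and Q from the claimed angle and then confirms the conditions of the construction (P on the circle, PQ = r, and B, P, Q on one line). The function names are hypothetical.

```python
import math


def strip_configuration(big_phi, r=1.0):
    """Points S, B, P, Q of the paper strip figure for an angle below pi (radians)."""
    phi = big_phi / 3.0                        # the claimed third of the angle
    S = (0.0, 0.0)                             # apex; leg SA lies on the positive x-axis
    B = (r * math.cos(big_phi), r * math.sin(big_phi))
    P = (-r * math.cos(phi), r * math.sin(phi))
    Q = (-2.0 * r * math.cos(phi), 0.0)        # on the extension of AS beyond S
    return S, B, P, Q


def check_strip(big_phi, r=1.0):
    """Return (P on circle, PQ == r, B-P-Q collinear, angle PQS)."""
    S, B, P, Q = strip_configuration(big_phi, r)
    on_circle = math.isclose(math.hypot(P[0], P[1]), r)
    marked_length = math.isclose(math.dist(P, Q), r)
    # Collinearity: the cross product of the vectors QP and QB vanishes.
    cross = (P[0] - Q[0]) * (B[1] - Q[1]) - (P[1] - Q[1]) * (B[0] - Q[0])
    angle_pqs = math.atan2(P[1] - Q[1], P[0] - Q[0])   # QS points along +x
    return on_circle, marked_length, abs(cross) < 1e-12, angle_pqs
```

For example, `check_strip(0.9)` reports that all three conditions hold and that the angle at Q is one third of 0.9 radians.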

The problem of the trisection of an angle can also be solved by means of a fixed hyperbola, as the Greek mathematician Pappus (ca. 300 a.d.) demonstrated in his ingenious masterwork Συναγωγαὶ μαθηματικαί (Collectiones mathematicae).

In order to understand the construction we must first solve the problem: Find the locus of the vertex P of a triangle ABP with fixed base AB when the base angles $\alpha$ and $\beta$ are to each other in the proportion of 2 to 1.

Let AB = 3k, AP = u. We lay off the angle $\beta$ at P on PB and designate the point of intersection of the free leg with segment AB as Q. The triangles BPQ and APQ are then isosceles ($\angle AQP$, as the external angle of BPQ, is equal to $2\beta = \alpha$); consequently, AP = QP = BQ = u. We then extend AB by BC = k and set CP equal to v. From figure AQCP it then follows, according to the apex transversal theorem, that

$v^2 - u^2 = CA \cdot CQ = 4k(k + u)$

or

$v^2 = (u + 2k)^2$,

more simply

$v = u + 2k$

or also

$v - u = 2k$.

This  is  the  equation  for  the  locus  in  bipolar  coordinates  u,  v. 
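The locus equation can be tested numerically. This sketch assumes A at the origin and B at (3k, 0), builds P from the base angles by the law of sines, and returns v - u; the function name is illustrative.

```python
import math


def bipolar_difference(beta, k=1.0):
    """v - u for the vertex P whose base angles are 2*beta at A and beta at B.

    Requires 0 < beta < pi/3 so that the triangle ABP exists.
    """
    alpha = 2.0 * beta
    ab = 3.0 * k
    # Law of sines: AP / sin(beta) = AB / sin(angle at P), angle at P = pi - 3*beta.
    u = ab * math.sin(beta) / math.sin(alpha + beta)
    P = (u * math.cos(alpha), u * math.sin(alpha))
    C = (4.0 * k, 0.0)                 # AB extended by BC = k
    v = math.dist(P, C)
    return v - u
```

For every admissible beta the result equals 2k up to rounding, in agreement with the equation of the hyperbola.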

The locus of the point P is thus a hyperbola with the foci A and C and the major axis BD = 2k. (D lies between A and B in such a way that, according to the locus equation $v - u = 2k$, CD = 3k, and AD is equal to k.)

Let us now consider this hyperbola as having been drawn once and for all for any k. (The half of the branch belonging to the focus A, lying above the major axis, is sufficient.)

In order to trisect the prescribed angle $\omega$ we draw about AB as chord the arc subtending the angle $180^\circ - \omega$ and call its intersection with the hyperbola P. Then

$\angle ABP = \beta = \omega/3$.

Proof. From $\angle APB = 180^\circ - \omega$ it follows that $\alpha + \beta = \omega$, i.e. (because $\alpha = 2\beta$), $3\beta = \omega$.

Note.  It  is  also  possible  to  trisect  an  angle  by  means  of  Nicomedes’ 
conchoid;  this  method,  however,  now  possesses  only  historical  interest. 

Supplement  to  Nos.  35,  36,  and  37 

On  the  degree  of  irreducible  equations  that  can  be  solved  by  quadratic  roots: 

Let a rational function of one or more magnitudes be known as an $\mathfrak{R}$-function and an algebraic equation with rational coefficients as an $\mathfrak{R}$-equation; in particular, let us designate an integral rational function of several magnitudes with rational coefficients as an $\mathfrak{R}$-polynomial. We will also call a quadratic root of a rational number or an $\mathfrak{R}$-function of such quadratic roots an expression of the first order, and a quadratic root of an expression of the first order or an $\mathfrak{R}$-function of such quadratic roots an expression of the second order, etc.

In  every  expression  of  the  mth  order  we  assume  that  none  of  its  roots  of  the 
mth  order  can  be  expressed  rationally  by  the  remaining  ones  or  even  by 
expressions  of  lower  than  the  mth  order;  we  assume  as  well  that  the  expression 
(by  elimination  of  irrational  denominators  and  powers  higher  than  the  first  of  the 
relevant  quadratic  roots)  has  been  put  into  its  simplest  form — the  normal  form. 
An expression of the mth order that contains the root of the mth order $\sqrt{\alpha}$ will thus appear in the form $\mathfrak{a} + a\sqrt{\alpha}$, where $\mathfrak{a}$ and $a$ are expressions of the mth order (or lower) in which $\sqrt{\alpha}$ does not recur.

Now let $x_1$ be an expression of the mth order which contains the mth-order roots $\sqrt{\alpha}, \sqrt{\beta}, \sqrt{\gamma}, \ldots$ and in which a total of n different roots [of mth and lower order] occur. If we change the signs of these n roots in every possible way, we obtain a total of $2^n = N$ similarly constructed root expressions $x_1, x_2, x_3, \ldots, x_N$.

We form the function

$F(x) = (x - x_1)(x - x_2) \cdots (x - x_N)$.


If  everywhere  in  this  expression  we  change  the  sign  of  any  of  the  above  n  roots 
contained  in  it,  the  value  of  the  expression  is  not  changed.  Thus,  if  we  multiply 
out  the  parentheses,  the  resulting  polynomial  of  x — as  we  know  from 
computations  with  root  expressions — will  merely  contain  the  squares  of  the  roots 
and is consequently an $\mathfrak{R}$-function of x. The equation

(1)  $F(x) = 0$

is thus an $\mathfrak{R}$-equation with the roots $x_1, x_2, \ldots, x_N$, which moreover need not all be different.
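A small numerical illustration may help here. The sketch below treats only the simplest case, an expression of the first order that is a plain sum of square roots of rational numbers (an assumption made for brevity; nested roots are not handled), and multiplies out F(x) over all sign changes. The helper names are hypothetical.

```python
import itertools
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sign_change_polynomial(radicands):
    """F(x) = product of (x - (+-sqrt(r1) +- sqrt(r2) ...)) over all sign choices."""
    poly = [1.0]
    for signs in itertools.product((1, -1), repeat=len(radicands)):
        root = sum(s * math.sqrt(r) for s, r in zip(signs, radicands))
        poly = poly_mul(poly, [1.0, -root])
    return poly


# x_1 = sqrt(2) + sqrt(3): n = 2 roots, hence N = 4 sign variants.
coefficients = sign_change_polynomial([2, 3])
```

Up to rounding, the coefficients are those of $x^4 - 10x^2 + 1$: the square roots have disappeared, as the text asserts.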

We now assert:

If an $\mathfrak{R}$-polynomial f(x) vanishes for a null value, such as $x_1$, of F(x), then f(x) will vanish for all the roots of F(x) = 0.

Proof. We write $x_1 = \mathfrak{a} + a\sqrt{\alpha}$ (see above) and introduce this value into f(x), and on computation we obtain

$0 = f(x_1) = \mathfrak{A} + A\sqrt{\alpha}$,

where $\mathfrak{A}$ and $A$ contain expressions of the mth order and lower with the exception of $\sqrt{\alpha}$. Now, since it is assumed that $\sqrt{\alpha}$ is independent of these expressions, $A$ cannot differ from zero (for otherwise it would follow that $\sqrt{\alpha} = -\mathfrak{A}/A$ and thus $\sqrt{\alpha}$ would be a function of $\sqrt{\beta}, \sqrt{\gamma}, \ldots$) and, therefore, necessarily

$A = 0$ and $\mathfrak{A} = 0$.

We will write the expressions $A$ and $\mathfrak{A}$ as $\mathfrak{b} + b\sqrt{\beta}$ and $\mathfrak{B} + B\sqrt{\beta}$, where $\mathfrak{b}$, $b$, $\mathfrak{B}$, $B$ are no longer dependent upon $\sqrt{\alpha}$ and $\sqrt{\beta}$. From

$\mathfrak{b} + b\sqrt{\beta} = 0$ and $\mathfrak{B} + B\sqrt{\beta} = 0$

it follows as above that

$\mathfrak{b} = 0$, $b = 0$, $\mathfrak{B} = 0$, $B = 0$,

etc. From these values we finally obtain equations that possess no roots but only rational numbers and which are, in other words, independent of the signs of the n roots occurring in $x_1$ and consequently are unchanged when the signs are changed in any way. Now, since this change of sign transforms $x_1$ into one of the values $x_2, x_3, \ldots, x_N$, f(x) must therefore also vanish for $x_2, x_3, \ldots, x_N$, which is what we set out to prove.


Among all the $\mathfrak{R}$-polynomials f(x) that vanish for $x = x_1$ there is one possessing the lowest possible degree $\nu$; let this be called $\varphi(x)$.

The polynomial $\varphi(x)$ is irreducible in the natural rationality domain (cf. No. 24).

[If $\varphi$ were divisible: $\varphi(x) = u(x) \cdot v(x)$, then when $\varphi(x_1) = 0$ it would necessarily follow that one of the factors, such as $v(x_1)$, must equal zero; this would contradict our assumption in that there would be a polynomial v of lower degree than $\varphi$ with the null value $x_1$.]

Since the $\mathfrak{R}$-polynomial F(x) vanishes for a null value $x_1$ of the irreducible polynomial $\varphi(x)$, F(x), according to Abel's irreducibility theorem (No. 25), is divisible by $\varphi(x)$:

$F(x) = F_1(x)\varphi(x)$.

Since, moreover, the $\mathfrak{R}$-polynomial $F_1(x)$ vanishes for a null value of F, and thus (by the theorem just proved) also for the null value $x_1$ of $\varphi$, $F_1$ is also divisible by $\varphi$ and $F_1(x) = F_2(x)\varphi(x)$; consequently

$F(x) = F_2(x)\varphi(x)^2$,

etc. Finally we obtain

$F(x) = \varphi(x)^\mu$

(assuming that the first coefficient of F and $\varphi$ has the value 1).

If  we  compare  the  degree  of  the  polynomial  on  the  right-hand  side  of  this 
equation  with  that  of  the  polynomial  on  the  left,  we  find  that 

$N = \mu\nu$.

Since, however, $N = 2^n$, $\nu$ must also be a power of 2.

Conclusion:  The  degree  of  an  irreducible  equation  with  rational  coefficients 
for  which  a  single  expression  formed  from  quadratic  roots  will  suffice  must  be  a 
power  of  2.  From  this  the  two  following  theorems  are  easily  obtained: 

I.  It  is  impossible  to  double  a  cube  with  compass  and  straight-edge. 

II. It is in general impossible to trisect an angle with compass and straight-edge.

In both problems the specific magnitude x to be constructed is a root of an irreducible equation of the third degree, and according to our conclusion it is impossible for a root of such an equation to be built up from quadratic roots, and therefore to be constructed with compass and straight-edge. [As is well known, all expressions that can be represented by compass and straight-edge constructions are either rational or built up from quadratic roots.]

Thus  it  merely  remains  to  show  that  the  equations  for  doubling  a  cube  and 
trisecting  an  angle  are  cubic  and  irreducible. 

The  edge  x  of  the  cube  that  is  twice  the  size  of  a  cube  with  an  edge  equal  to  1 
satisfies  the  equation 


$x^3 - 2 = 0$.

If this equation were reducible, then it would necessarily follow that

$x^3 - 2 = (x^2 + hx + k)(x - l)$,

where h, k, l are rational numbers. Accordingly, the equation $x^3 = 2$ would have to possess the rational root $l = p/q$, where we may assume that p and q have no common divisor, and consequently $(p/q)^3$ would have to be equal to 2 or $p^3$ equal to $2q^3$. Consequently, $p^3$ would have to be divisible by $q^3$ and therefore p would also have to be divisible by q, which is not the case (unless q = 1, and then $p^3 = 2$, which no integer satisfies).

In the trisection of an angle we can consider the given angle $\alpha$ and the angle we are looking for $\varphi$ as peripheral angles of a unit circle, so that the subtended chords are $a = 2\sin\alpha$ and $x = 2\sin\varphi$, respectively. From $\alpha = 3\varphi$ and $\sin 3\varphi = 3\sin\varphi - 4\sin^3\varphi$ it follows that

$\sin\alpha = 3\sin\varphi - 4\sin^3\varphi$

or

$x^3 - 3x + a = 0$.

If we assume a chord a of length 3m/n, where m and n possess no common divisors and are integers that cannot be divided by 3, and if we multiply the equation by $n^3$ and set $nx = X$, the equation assumes the form

$X^3 - 3n^2X + 3mn^2 = 0$.


But according to Schoenemann's theorem (No. 25) this equation is irreducible, since the coefficient of X is divisible by the prime number 3 and the free term is divisible by 3, but not by $3^2$.
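Because a cubic with rational coefficients is reducible only if it has a rational root, both irreducibility claims can be spot-checked by a rational root search. The sketch below is not the argument used in the text (which relies on Schoenemann's theorem); it swaps in the elementary rational root test, and the sample values m = 1, n = 2 are chosen here purely for illustration.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial (highest degree first, nonzero constant term)."""
    lead, const = coeffs[0], coeffs[-1]
    found = set()
    for p in divisors(const):
        for q in divisors(lead):
            for cand in (Fraction(p, q), Fraction(-p, q)):
                value = Fraction(0)
                for c in coeffs:              # Horner evaluation in exact arithmetic
                    value = value * cand + c
                if value == 0:
                    found.add(cand)
    return found


doubling = rational_roots([1, 0, 0, -2])          # x^3 - 2
m, n = 1, 2                                       # illustrative chord a = 3m/n = 3/2
trisection = rational_roots([1, 0, -3 * n * n, 3 * m * n * n])
```

Both searches come back empty, so neither cubic has a linear factor with rational coefficients.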
The Regular Heptadecagon


To  construct  a  regular  heptadecagon. 

In  other  words:  To  divide  the  perimeter  of  a  circle  into  17  equal  parts. 

This celebrated problem was solved by Gauss in his major work Disquisitiones arithmeticae, published in 1801. In the section of this work dealing with the solution of the binomial equations $x^n = 1$ Gauss proved the important theorem:

A regular polygon can be constructed with compass and straight-edge when and only when the number of its sides has the form $2^m p_1 p_2 \cdots p_\nu$, where $p_1, p_2, \ldots, p_\nu$ are all different prime numbers of the form $2^n + 1$.

For $m = 0$, $\nu = 1$, and $p_1 = 3$ and $p_1 = 5$, we obtain the cases of the regular triangle and pentagon, respectively, which had already been solved in antiquity.

In  the  conclusion  to  his  investigations  Gauss  said,  “The  division  of  a  circle 
into  three  and  into  five  equal  parts  was  already  known  in  Euclid’s  time;  it  is 
amazing  that  nothing  new  was  added  to  these  discoveries  in  the  next  two 
thousand  years,  that  the  geometers  considered  it  as  confirmed  that,  except  for 
these  cases  and  those  that  could  be  derived  from  them,  regular  polygons  could 
not  be  constructed  with  compass  and  straight-edge.” 

The  great  advances  made  in  the  division  of  the  circle  by  Gauss  were  possible 
only  because  Gauss  transformed  the  originally  purely  geometrical  problem  into 
an  algebraic  one.  He  arrived  at  this  transformation  in  the  course  of  his 
representation  of  complex  numbers  in  the  Gauss  plane,  which  was  named  after 
him. 

An arbitrary complex number c = a + bi is conventionally represented in this plane by a point with the coordinates a | b; this point itself is designated as “the complex number c.” Another common method is the trigonometric representation

$c = r(\cos\vartheta + i\sin\vartheta)$

of the complex number c, where r represents the so-called magnitude (modulus) of the number, the distance of the number c from the null point O of the number plane, and $\vartheta$ the so-called angle of the number, which is the angle formed by the distance r and the axis of the positive real numbers.

The points of the unit circle $\mathfrak{K}$ drawn about the center O represent the so-called Gauss numbers, i.e., numbers of the form

$\gamma = \cos\varphi + i\sin\varphi$,

where $\varphi$ is the angle of the number $\gamma$.
We will write for short

$\cos\varphi + i\sin\varphi = 1_\varphi$.

The fundamental property of the Gauss numbers is described by the relation

$1_\alpha \cdot 1_\beta = 1_{\alpha+\beta}$,

i.e., the product of two Gauss numbers is also a Gauss number; the angle of the product is the sum of the angles of the factors.

It is easily confirmed that the theorem also holds for products of more than two Gauss numbers.

For example,

$(1_\varphi)^n = 1_\varphi \cdot 1_\varphi \cdots 1_\varphi = 1_{n\varphi}$,

or, written out fully,

$(\cos\varphi + i\sin\varphi)^n = \cos n\varphi + i\sin n\varphi$.

This  is  Demoivre’s  formula  (Abraham  Demoivre,  1667-1754). 
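The formula is easy to test with complex arithmetic. A minimal sketch, with illustrative function names:

```python
import math


def gauss_number(phi):
    """The point of the unit circle with angle phi, i.e. cos(phi) + i*sin(phi)."""
    return complex(math.cos(phi), math.sin(phi))


def demoivre_gap(phi, n):
    """Distance between (cos phi + i sin phi)^n and cos(n phi) + i sin(n phi)."""
    return abs(gauss_number(phi) ** n - gauss_number(n * phi))


def product_gap(alpha, beta):
    """Distance between the product of two Gauss numbers and the Gauss number of the summed angle."""
    return abs(gauss_number(alpha) * gauss_number(beta) - gauss_number(alpha + beta))
```

Both gaps are zero up to rounding error for any angles and any positive integer n.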

To obtain a regular polygon of n angles we mark off the angle $\varphi = 2\pi/n$ n times in succession from point 1 on $\mathfrak{K}$. The resulting points representing the divisions are

$\varepsilon_1 = \varepsilon = \cos\varphi + i\sin\varphi$, $\varepsilon_2 = \cos 2\varphi + i\sin 2\varphi$, ..., $\varepsilon_n = \cos n\varphi + i\sin n\varphi = 1$.

Then

$\varepsilon_\nu = \varepsilon^\nu$ and $\varepsilon_\nu^n = (\varepsilon^n)^\nu = 1$.

The n corners $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ of a regular polygon of n angles are therefore the roots of the equation

$z^n = 1$.

Thus the geometric problem of “constructing a regular polygon of n angles,” following Gauss, turns out to be the problem “of finding the roots of the equation $z^n = 1$.”

Since one of the n roots of this equation has the value 1, we need only find the other (n - 1) roots. These satisfy the equation

$z^{n-1} + z^{n-2} + \cdots + z^2 + z + 1 = 0$,

the so-called circle partition equation. In the case of n = 3, for example, the equation reads

$z^2 + z + 1 = 0$

and has the roots

$\varepsilon_1 = \frac{-1 + i\sqrt{3}}{2}$,  $\varepsilon_2 = \frac{-1 - i\sqrt{3}}{2}$.

Since the complex numbers $\varepsilon_1$ and $\varepsilon_2$ both possess the real component $-\tfrac{1}{2}$, the corners $\varepsilon_1$ and $\varepsilon_2$ of the regular triangle are the points of intersection of $\mathfrak{K}$ with the parallel to the imaginary number axis that passes through the point $-\tfrac{1}{2}$.
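The corners of the regular n-gon and the circle partition equation can be checked numerically. A small sketch (function names are illustrative):

```python
import cmath
import math


def polygon_corners(n):
    """Corners eps_1, ..., eps_n of the regular n-gon on the unit circle (eps_n = 1)."""
    phi = 2.0 * math.pi / n
    return [cmath.exp(1j * k * phi) for k in range(1, n + 1)]


def partition_value(z, n):
    """Value of z^(n-1) + z^(n-2) + ... + z + 1."""
    return sum(z ** k for k in range(n))


triangle = polygon_corners(3)
```

Every corner satisfies $z^n = 1$, every corner other than 1 is a root of the circle partition equation, and the first two corners of the triangle have real part -1/2.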

A  proof  of  the  general  theorem  of  Gauss  would  take  us  too  far,  so  that  we  will 
restrict  ourselves  here  to  a  brief  exposition  of  the  basic  idea  and  the  elements 
that  are  necessary  for  an  understanding  of  the  construction  of  the  regular 
heptadecagon. 

Let us first take note of the fact that the construction of the regular $2^m N$-gon, where N is the product of the odd prime numbers p, q, r, ..., is equivalent to drawing the regular p-gon, q-gon, r-gon, etc. If we have these polygons, we determine the integral numbers x, y, z, ... in such manner that

$\frac{N}{p}x + \frac{N}{q}y + \frac{N}{r}z + \cdots = 1$.

This can be done because the numbers

$\frac{N}{p}, \frac{N}{q}, \frac{N}{r}, \ldots$

have no common divisor. Then

$\frac{x}{p} + \frac{y}{q} + \frac{z}{r} + \cdots = \frac{1}{N}$,

so that the Nth part of $\mathfrak{K}$ is obtained by joining the x pths, y qths, z rths, ... of the circle perimeter.
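For two prime factors the integers x and y can be found with a modular inverse. The sketch below is restricted to N = pq (an assumption made to keep it short) and uses the three-argument `pow` with exponent -1, available from Python 3.8; the function name is hypothetical.

```python
from fractions import Fraction


def combine_two(p, q):
    """Integers x, y with (N/p)*x + (N/q)*y = 1 for N = p*q, i.e. q*x + p*y = 1."""
    x = pow(q, -1, p)          # q*x leaves the residue 1 on division by p
    y = (1 - q * x) // p       # exact, because 1 - q*x is a multiple of p
    return x, y


x, y = combine_two(3, 5)       # the regular 15-gon from the triangle and the pentagon
```

Here two thirds of the circle minus three fifths of the circle gives one fifteenth, since 2/3 - 3/5 = 1/15.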

Consequently,  we  need  only  be  concerned  with  the  solution  of  the  circle 
partition  equation 

(1)  $z^{p-1} + z^{p-2} + \cdots + z^2 + z + 1 = 0$,

in which p is a prime number of the form $2^n + 1$.

The brilliant idea underlying Gauss' method of solution consists in grouping the roots $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{p-1}$ of (1) (where $\varepsilon_\nu = \varepsilon^\nu$, $\varepsilon = \cos\varphi + i\sin\varphi$, $\varphi = 2\pi/p$) into so-called periods. The Gauss periods are root sums in which each successive term is the gth power of the preceding term, and the gth power of the last sum term results once again in the first term (hence the name period). The exponent g is here a so-called primitive root of the prime number p, i.e., an integer such that $g^{p-1}$ is the smallest of its positive integral powers that leaves a residue of 1 on division by p. In other words, g is an integer such that the roots of (1) can be expressed in the form

$z_0 = \varepsilon$, $z_1 = \varepsilon^{g}$, $z_2 = \varepsilon^{g^2}$, ..., $z_{p-2} = \varepsilon^{g^{p-2}}$.


The  next  period  is 

$z_0 + z_1 + z_2 + \cdots + z_{p-2}$.

In fact,

$z_{\nu+1} = z_\nu^{\,g}$ and $z_{p-2}^{\,g} = \varepsilon^{g^{p-1}} = \varepsilon^{sp+1}$ (where s is an integer) $= \varepsilon$.

The following period contains only a = (p - 1)/2 terms and reads

$z_0 + z_2 + z_4 + \cdots + z_r$  (r = 2a - 2).

In this period each term is the Gth power of the preceding term and $z_r^{\,G} = z_0$,

where $G = g^2$ is similarly a primitive root of p.

Let

$b = \tfrac{1}{2}a$, $c = \tfrac{1}{2}b$, $d = \tfrac{1}{2}c$, etc.

Gauss' method for solving the circle partition equation consists of reducing (1) to a chain of groups of quadratic equations. The first group contains one, the second group two, the third group four, the fourth group eight, etc., and the last group a quadratic equations. The roots of the first group form periods of a terms, those of the second group periods of b terms, those of the third periods of c terms, those of the last periods of a single term, i.e., the roots of (1) itself. The coefficients of the equations of one group can be determined from the coefficients of the preceding group, so that the equations of the last group give us the roots of (1) directly.

In the successive determination of coefficients the formula

(2)  $\varepsilon^E = \varepsilon^r$,

in which r represents the residue remaining when the integral exponent E is divided by p, plays a predominant role.

We  will  now  use  the  Gauss  method  to  solve  the  equation  for  the 
heptadecagon  (p  =  17). 


$z^{16} + z^{15} + \cdots + z^2 + z + 1 = 0$.

Let $\varphi = 2\pi/17$, $\varepsilon = \varepsilon_1 = \cos\varphi + i\sin\varphi$, $\varepsilon_\nu = \varepsilon^\nu$, and accordingly, let $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_{17}$ be the corners of the heptadecagon, and let $z_\nu = \varepsilon^{g^\nu}$, where g represents the (smallest) primitive root 3 of 17. The powers $3^1, 3^2, 3^3, \ldots, 3^{16}$ on division by 17 leave the residues

3,  9,  10,  13,  5,  15,  11,  16,  14,  8,  7,  4,  12,  2,  6,  1. 
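The residues are quickly reproduced with modular exponentiation. A short sketch, with illustrative function names:

```python
def power_residues(g, p):
    """Residues of g^1, g^2, ..., g^(p-1) on division by p."""
    return [pow(g, k, p) for k in range(1, p)]


def is_primitive_root(g, p):
    """True when the powers of g run through every residue 1, ..., p-1."""
    return sorted(power_residues(g, p)) == list(range(1, p))


residues_of_3 = power_residues(3, 17)
```

The list agrees with the one printed above and contains each of the numbers 1 to 16 exactly once, which is what makes 3 a primitive root of 17; the number 2, by contrast, is not one.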

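Consequently, according to (2),

$z_0 = \varepsilon$, $z_2 = \varepsilon^{9}$, $z_4 = \varepsilon^{13}$, $z_6 = \varepsilon^{15}$, $z_8 = \varepsilon^{16}$, $z_{10} = \varepsilon^{8}$, $z_{12} = \varepsilon^{4}$, $z_{14} = \varepsilon^{2}$,
$z_1 = \varepsilon^{3}$, $z_3 = \varepsilon^{10}$, $z_5 = \varepsilon^{5}$, $z_7 = \varepsilon^{11}$, $z_9 = \varepsilon^{14}$, $z_{11} = \varepsilon^{7}$, $z_{13} = \varepsilon^{12}$, $z_{15} = \varepsilon^{6}$.

Each root in the series $z_0, z_1, z_2, \ldots$ is the cube of the preceding one.

The first group in the chain contains a quadratic equation the roots of which are the periods

$X = z_0 + z_2 + z_4 + z_6 + z_8 + z_{10} + z_{12} + z_{14} = \varepsilon + \varepsilon^{9} + \varepsilon^{13} + \varepsilon^{15} + \varepsilon^{16} + \varepsilon^{8} + \varepsilon^{4} + \varepsilon^{2}$

and

$x = z_1 + z_3 + z_5 + z_7 + z_9 + z_{11} + z_{13} + z_{15} = \varepsilon^{3} + \varepsilon^{10} + \varepsilon^{5} + \varepsilon^{11} + \varepsilon^{14} + \varepsilon^{7} + \varepsilon^{12} + \varepsilon^{6}$.

Since the sum of the roots of (1) possesses the value -1, we obtain the relation

$X + x = -1$.

Making use of (2), we find on computation that Xx is equal to four times the sum of all the roots of (1), and consequently

$Xx = -4$.

The quadratic equation for the periods X and x consequently reads

(I)  $t^2 + t - 4 = 0$.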
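The sum and product of the two eight-term periods can be confirmed by direct summation of the roots. A minimal sketch, with names chosen here for illustration:

```python
import cmath
import math

P = 17
EPS = cmath.exp(2j * math.pi / P)      # eps = cos(phi) + i sin(phi), phi = 2*pi/17


def z(nu):
    """z_nu = eps^(3^nu), with the exponent reduced modulo 17 as in formula (2)."""
    return EPS ** pow(3, nu, P)


period_X = sum(z(nu) for nu in range(0, 16, 2))   # z_0 + z_2 + ... + z_14
period_x = sum(z(nu) for nu in range(1, 16, 2))   # z_1 + z_3 + ... + z_15
```

Numerically the two periods are real, add up to -1 and multiply to -4, so they are the two roots of $t^2 + t - 4 = 0$.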

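Its roots are

$X = \frac{-1 + \sqrt{17}}{2}$  and  $x = \frac{-1 - \sqrt{17}}{2}$.

That X > x is shown in the following manner. If we designate the real component of the complex number c as $\mathfrak{R}c$, then (cf. Fig. 29)

(3)  $\mathfrak{R}\varepsilon_\mu = \mathfrak{R}\varepsilon_\nu$ if $\mu + \nu = 17$,

since the corners $\varepsilon_\mu$ and $\varepsilon_\nu$ of the heptadecagon are symmetrical to the real axis. Applying this rule, we obtain

$\mathfrak{R}X = 2[\mathfrak{R}\varepsilon_1 + \mathfrak{R}\varepsilon_2 + \mathfrak{R}\varepsilon_4 + \mathfrak{R}\varepsilon_8]$,

$\mathfrak{R}x = 2(\mathfrak{R}\varepsilon_3 + \mathfrak{R}\varepsilon_5 + \mathfrak{R}\varepsilon_6 + \mathfrak{R}\varepsilon_7)$.

A glance at the figure shows that the bracket is positive and the parenthesis negative.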

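The four four-term periods are

$U = z_0 + z_4 + z_8 + z_{12} = \varepsilon + \varepsilon^{13} + \varepsilon^{16} + \varepsilon^{4}$,

$u = z_2 + z_6 + z_{10} + z_{14} = \varepsilon^{9} + \varepsilon^{15} + \varepsilon^{8} + \varepsilon^{2}$,

$V = z_1 + z_5 + z_9 + z_{13} = \varepsilon^{3} + \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{12}$,

$v = z_3 + z_7 + z_{11} + z_{15} = \varepsilon^{10} + \varepsilon^{11} + \varepsilon^{7} + \varepsilon^{6}$.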


FIG.  29. 


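Here we obtain

$U + u = X$,  $V + v = x$

and, applying rule (2),

$Uu = \varepsilon^1 + \varepsilon^2 + \cdots + \varepsilon^{16} = -1$,  $Vv = \varepsilon^1 + \varepsilon^2 + \cdots + \varepsilon^{16} = -1$.

The respective quadratic equations are

(II)  $t^2 - Xt - 1 = 0$,  $t^2 - xt - 1 = 0$.

Their roots are

$U = \frac{X + \sqrt{X^2 + 4}}{2}$,  $u = \frac{X - \sqrt{X^2 + 4}}{2}$,

$V = \frac{x + \sqrt{x^2 + 4}}{2}$,  $v = \frac{x - \sqrt{x^2 + 4}}{2}$.

It follows from rule (3) that U > u and V > v, since

$\mathfrak{R}U = 2[\mathfrak{R}\varepsilon_1 + \mathfrak{R}\varepsilon_4]$,  $\mathfrak{R}V = 2[\mathfrak{R}\varepsilon_3 + \mathfrak{R}\varepsilon_5]$,

$\mathfrak{R}u = 2(\mathfrak{R}\varepsilon_2 + \mathfrak{R}\varepsilon_8)$,  $\mathfrak{R}v = 2(\mathfrak{R}\varepsilon_6 + \mathfrak{R}\varepsilon_7)$.

A look at the heptadecagon shows that the brackets are larger than the parentheses immediately below them.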

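Of the two-membered periods obtained we need only the two

$W = z_0 + z_8 = \varepsilon + \varepsilon^{16}$  and  $w = z_4 + z_{12} = \varepsilon^{13} + \varepsilon^{4}$.

Here we find

$W + w = U$

and, according to (2),

$Ww = \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{3} + \varepsilon^{12} = V$.

Here also W > w, since $\mathfrak{R}W = 2\mathfrak{R}\varepsilon_1$ and $\mathfrak{R}w = 2\mathfrak{R}\varepsilon_4$, but $\mathfrak{R}\varepsilon_1 > \mathfrak{R}\varepsilon_4$.

The quadratic equation with the roots W and w reads

(III)  $t^2 - Ut + V = 0$.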

The construction of the heptadecagon accordingly consists of the following four steps:

I. Construction of X and x;

II. construction of U and V;

III. construction of W and w according to (III);

IV. finding the points W and w on the real number axis. The perpendicular bisectors of the lines joining them to the null point cut the circle $\mathfrak{K}$ at the corners $\varepsilon_1$, $\varepsilon_{16}$ and $\varepsilon_4$, $\varepsilon_{13}$ of the regular heptadecagon (thus all the other corners are also determined).
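The chain of quadratic equations (I), (II), (III) can be followed with nothing but square roots. The sketch below takes the larger root at each stage, as the inequalities X > x, U > u, V > v, W > w prescribe; the function name is illustrative.

```python
import math


def heptadecagon_chain():
    """Return (X, x, U, V, W) obtained from the quadratic equations (I), (II), (III)."""
    X = (-1 + math.sqrt(17)) / 2           # larger root of t^2 + t - 4 = 0
    x = (-1 - math.sqrt(17)) / 2           # smaller root of the same equation
    U = (X + math.sqrt(X * X + 4)) / 2     # larger root of t^2 - X t - 1 = 0
    V = (x + math.sqrt(x * x + 4)) / 2     # larger root of t^2 - x t - 1 = 0
    W = (U + math.sqrt(U * U - 4 * V)) / 2 # larger root of t^2 - U t + V = 0
    return X, x, U, V, W


X, x, U, V, W = heptadecagon_chain()
cos_phi = W / 2                            # W = eps + eps^16 = 2 cos(2*pi/17)
```

The value W/2 agrees with the cosine of 2π/17, which is the abscissa at which the perpendicular bisector in step IV meets the real axis.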
Archimedes' Determination of the Number π


Archimedes of Syracuse (287?-212 b.c.) was the greatest mathematician of the ancient world.

The most famous of his achievements is the measurement of the circle. The crux of this problem is the calculation of the number $\pi$, i.e., the number by which the diameter and the square of the radius must be multiplied to determine the circumference and area, respectively, of a circle.*

The idea upon which Archimedes' method was based is the following. The circumference of a circle lies between the perimeters of a circumscribed and inscribed n-gon, and in particular, the greater n is, the smaller is the deviation of the circumference of the circle from the perimeters of the two n-gons. Then the object is to calculate the perimeters of a circumscribed and inscribed regular polygon with so great a number of sides that their difference is equal to a negligible magnitude $\varepsilon$. Then if the circumference of the circle is set equal to the perimeter of one of these polygons, the resulting deviation from the true circumference of the circle is smaller than $\varepsilon$, with the result that when $\varepsilon$ is sufficiently small the circumference of the circle is determined with sufficient accuracy.


The  particular  achievement  of  Archimedes  was  to  indicate  a  method  by 
which  the  perimeters  of  such  many-sided  polygons  could  be  calculated. 

This  method,  the  so-called  Archimedes  algorithm ,  is  based  upon  the  two 
Archimedes  recurrence  formulas  which  we  will  now  derive. 

In Figure 30, let Z be the center of the circle, let AB = 2t be the side of the circumscribed and CD = 2s the side of the inscribed regular n-gon. Let M be the midpoint of AB and N the midpoint of CD, and let O be the point of intersection with MA of the tangent to the circle passing through C. Accordingly, OM = OC = t' is half the side of the circumscribed 2n-gon and MC = MD = 2s' is the side of the inscribed regular 2n-gon.

Since ACO and AMZ are similar right triangles,

$t'/(t - t') = OC/OA = MZ/AZ$,

and from the ray theorem,

$s/t = NC/MA = CZ/AZ$.

Since the right sides of these proportions are equal (MZ = CZ), we obtain $t'/(t - t') = s/t$ or

$t' = \frac{st}{s + t}$.

Since the isosceles triangles CMD and COM are similar, $2s'/2s = t'/2s'$, i.e.,

$2s'^2 = st'$.

If a is the perimeter of the circumscribed n-gon and b the perimeter of the inscribed n-gon, and a' and b' are the perimeters, respectively, of the circumscribed and inscribed 2n-gons, we then have

$a = 2nt$, $b = 2ns$, $a' = 4nt'$, $b' = 4ns'$.

If we then introduce the values obtained for t, s, t', s' from these equations into the two formulas we have found, they are transformed into the Archimedes recurrence formulas:

(I)  $a' = \frac{2ab}{a + b}$,  (II)  $b' = \sqrt{ba'}$.

Thus, a' is the harmonic mean of a and b, b' the geometric mean of b and a'.
Now let us consider in succession the regular n-gon, 2n-gon, 4n-gon, 8n-gon, etc., and let us designate the perimeters of the circumscribed and inscribed $2^\nu n$-gons as $a_\nu$ and $b_\nu$, respectively. We then obtain the Archimedes series

$a_0, b_0, a_1, b_1, a_2, b_2, \ldots$


of the successive perimeters. Here the recurrence formulas (I) and (II) read

(1)  $a_{\nu+1} = \frac{2a_\nu b_\nu}{a_\nu + b_\nu}$,

(2)  $b_{\nu+1} = \sqrt{b_\nu a_{\nu+1}}$.

That is: Each term of the Archimedes series is alternately the harmonic and geometric mean of the two preceding terms.

Using this rule, we are able to calculate all the terms of the series if the first two terms are known. The Archimedes algorithm consists of this calculation of the successive perimeters of the polygons.

Archimedes chose as his initial polygon the regular hexagon, the perimeters of which are $a_0 = 4\sqrt{3}\,r$ and $b_0 = 6r$, respectively, and worked out the series $a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4$ up to the perimeters $a_4$ and $b_4$ of the circumscribed and inscribed regular 96-cornered polygon. He found that

$a_4 \approx 3\tfrac{1}{7}\,d$,  $b_4 \approx 3\tfrac{10}{71}\,d$,

where d is the diameter of the circle. The Archimedes approximation for the value of $\pi$ is consequently

$\pi \approx 3\tfrac{1}{7} \approx 3.14$.
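The Archimedes algorithm takes only a few lines in floating-point arithmetic. The sketch below starts from the hexagon with d = 1 (so that $a_0 = 2\sqrt{3}$ and $b_0 = 3$) and applies formulas (1) and (2) four times; Archimedes himself worked with rational bounds for the square roots, which this sketch does not attempt to reproduce. The function name is illustrative.

```python
import math


def archimedes_series(a0, b0, steps):
    """Pairs (a_v, b_v) for v = 0..steps from the recurrence formulas (1) and (2)."""
    pairs = [(a0, b0)]
    a, b = a0, b0
    for _ in range(steps):
        a = 2.0 * a * b / (a + b)      # (1): harmonic mean of a_v and b_v
        b = math.sqrt(b * a)           # (2): geometric mean of b_v and a_(v+1)
        pairs.append((a, b))
    return pairs


pairs = archimedes_series(2.0 * math.sqrt(3.0), 3.0, 4)   # hexagon to 96-gon, d = 1
a4, b4 = pairs[-1]
```

The last pair encloses π and lies between 3 10/71 and 3 1/7, in agreement with the bounds stated above.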

Note. The calculations involved in the Archimedes method are very laborious. For this reason Christian Huygens, in his treatise published in Leyden in 1654, De circuli magnitudine inventa, replaced the limits $a_\nu$ and $b_\nu$ of the circumference u of the Archimedes method by the limits $\alpha_\nu$ and $\beta_\nu$, which gave a closer approximation of u, since it made it possible to obtain $\pi$ correctly to two decimal places for $\nu = 1$. Huygens' method, however, involves rather complicated considerations. The following method supplied by the author is faster and more convenient; it is based on the known theorem: The harmonic mean of two numbers is smaller than the geometric mean of the numbers. This can be expressed as

$\frac{2xy}{x + y} < \sqrt{xy}$.

[Since $(\sqrt{x} - \sqrt{y})^2 > 0$, it follows that $2\sqrt{xy} < x + y$, and from this, multiplication with $\sqrt{xy}/(x + y)$ gives the designated inequality.]

According to this theorem, we obtain from (1) $a_{\nu+1} < \sqrt{a_\nu b_\nu}$. If we multiply the square of this inequality by the square of (2), we obtain

$a_{\nu+1} b_{\nu+1}^2 < a_\nu b_\nu^2$

or, if we set

$\sqrt[3]{a_\nu b_\nu^2} = A_\nu$,

then

(3)  $A_{\nu+1} < A_\nu$.

According to the same theorem, it follows from (2) that

$b_{\nu+1} > \frac{2}{\frac{1}{b_\nu} + \frac{1}{a_{\nu+1}}}$  or  $\frac{2}{b_{\nu+1}} < \frac{1}{b_\nu} + \frac{1}{a_{\nu+1}}$.

If we then add to this inequality the equation

$\frac{2}{a_{\nu+1}} = \frac{1}{a_\nu} + \frac{1}{b_\nu}$,

which is only a different manner of writing (1), we obtain

$\frac{2}{b_{\nu+1}} + \frac{1}{a_{\nu+1}} < \frac{2}{b_\nu} + \frac{1}{a_\nu}$

or

$\frac{3a_{\nu+1}b_{\nu+1}}{2a_{\nu+1} + b_{\nu+1}} > \frac{3a_\nu b_\nu}{2a_\nu + b_\nu}$

or, in abbreviated form, if we set

$\frac{3a_\nu b_\nu}{2a_\nu + b_\nu} = B_\nu$,

then

(4)  $B_{\nu+1} > B_\nu$.

The inequalities (3) and (4) imply that as $\nu$ increases, $A_\nu$ grows continuously smaller, $B_\nu$ continuously larger.

Since for infinitely great $\nu$, both $A_\nu$ and $B_\nu$ become the circumference u of the circle, for every finite $\nu$ it must be true that

$B_\nu < u < A_\nu$.

The limits $A_\nu$ and $B_\nu$ of this inequality are much narrower than the Archimedes limits $a_\nu$ and $b_\nu$. If we take the hexagon, for example, as our initial polygon and d = 1, then $a_0 = 2\sqrt{3}$, $b_0 = 3$, $u = \pi$, and we obtain $A_1 = 3.1423$ and $B_0 = 3.1402$; thus we are able to obtain the correct value of $\pi$ to two accurate decimal places by using only the inscribed hexagon and the circumscribed dodecagon, whereas the same precision is achieved by the Archimedes method only with the use of the polygon of 96 sides.
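The narrower limits can be computed from the first terms of the series. The sketch below assumes the definitions $A_\nu = \sqrt[3]{a_\nu b_\nu^2}$ and $B_\nu = 3a_\nu b_\nu/(2a_\nu + b_\nu)$ as they follow from inequalities (3) and (4); the function name is illustrative.

```python
import math


def refined_bounds(a, b):
    """Upper limit A and lower limit B formed from the perimeters a (circumscribed) and b (inscribed)."""
    upper = (a * b * b) ** (1.0 / 3.0)       # A = cube root of a*b^2
    lower = 3.0 * a * b / (2.0 * a + b)      # B = 3ab / (2a + b)
    return upper, lower


a0, b0 = 2.0 * math.sqrt(3.0), 3.0           # hexagon, d = 1
a1 = 2.0 * a0 * b0 / (a0 + b0)               # recurrence formula (1)
b1 = math.sqrt(b0 * a1)                      # recurrence formula (2)
A0, B0 = refined_bounds(a0, b0)
A1, B1 = refined_bounds(a1, b1)
```

The values satisfy $B_0 < B_1 < \pi < A_1 < A_0$, and $B_0$ and $A_1$ already differ by less than 0.002.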


To  find  the  relation  between  the  radii  and  the  line  joining  the  centers  of  the 
circles  of  circumscription  and  inscription  of  a  bicentric  quadrilateral. 


A bicentric or chord-tangent quadrilateral is defined as a quadrilateral that is simultaneously inscribed in one circle and circumscribed about another. Let PQRS be such a quadrilateral, £ the circumscribed circle, T the inscribed circle. Let the points of tangency of the opposite sides PQ and RS with circle T be X and X', let the points of tangency of the opposite sides QR and SP be Y and Y', and let the point of intersection of the tangency chords XX' and YY' be O. If we then apply the theorem of the sum of the angles of a quadrilateral to the two quadrilaterals OXPY and OX'RY', designating the quadrilateral angles by means of a line over the letter representing the corner, we obtain the two equations

$\bar{O} + \bar{X} + \bar{P} + \bar{Y} = 360^\circ$,  $\bar{O} + \bar{X}' + \bar{R} + \bar{Y}' = 360^\circ$.

Since the angles $\bar{X}$ and $\bar{X}'$ ($\bar{Y}$ and $\bar{Y}'$) situated at opposite sides of the chord XX' (YY') add up to 180°, addition of the two equations gives the following relation

(1)  $2\bar{O} + \bar{P} + \bar{R} = 360^\circ$.

Now the sum of the two opposite angles $\bar{P}$ and $\bar{R}$ of the chord quadrilateral PQRS is 180°; consequently, $\bar{O} = 90^\circ$.

The  tangency  chords  of  the  two  pairs  of  opposite  sides  of  a  bicentric 
quadrilateral  are  therefore  perpendicular  to  each  other. 

This condition is also sufficient: A bicentric quadrilateral PQRS is obtained if the tangents PQ, RS, SP, QR are drawn through the end points X, X', Y, Y' of two perpendicular chords XX' and YY' of an arbitrary circle T. In fact, it now follows from (1), since $\bar{O} = 90^\circ$, that the sum of the opposite angles $\bar{P}$ and $\bar{R}$ is 180°, i.e., that PQRS is also a chord quadrilateral.

The simplest way of obtaining the desired relation between the radii and the axis of the centers of the circumscribed and inscribed circles is by means of the following locus problem. A right angle is rotated about its fixed vertex, which is located inside a circle; find the locus of the point of intersection of the two circle tangents that pass through the points of intersection of the legs of the angle with the circle.

Solution of the locus problem. Let the given circle be known as T, its midpoint as M, its radius as $\rho$, the fixed vertex of the right angle as O, the distance of the vertex from M as e. Let the legs of the right angle intersect the circle at the (moving) points X and Y; and let the point of intersection of the two circle tangents passing through X and Y be known as P and its distance from the center of the circle as p.

We will first determine the relation between p and the angle $\varphi$ (= ∠OMP) that MP forms with the fixed line MO.

Since OXY is a right triangle,

$OF^2 = FX \cdot FY,$

where F represents the base point of the altitude to the hypotenuse. If we introduce the projections $p' = MN$ and $e' = e\cos\varphi$ and $p'' = NX$ and $e'' = e\sin\varphi$ (= NF) on the lines MP and XY, respectively (N being the point in which MP meets the chord XY), the equation can be written

$(p' - e')^2 = (p'' - e'')(p'' + e'')$

or

$2p'^2 - 2p'e' + e'^2 + e''^2 = p'^2 + p''^2$

or

(2) $2p'^2 - 2p'e\cos\varphi + e^2 = \rho^2.$

Since MXP is a right triangle,

$MX^2 = MP \cdot MN$

or

(3) $\rho^2 = p\,p'.$

If we introduce the value of $p'$ from (3) into (2), we obtain the relation we are looking for:

(4) $p^2 + 2\,\dfrac{\rho^2 e}{\rho^2 - e^2}\,p\cos\varphi = \dfrac{2\rho^4}{\rho^2 - e^2}.$

The distance r = ZP of a point Z from P on the extension of OM at a distance of MZ = z from M is obtained by the cosine theorem

(5) $r^2 = z^2 + p^2 + 2zp\cos\varphi.$

If for z, which up to this point has been arbitrary, we now choose the value

(I) $MZ = z = \dfrac{\rho^2 e}{\rho^2 - e^2},$

we obtain, in accordance with (4),

(II) $r^2 = z^2 + \dfrac{2\rho^4}{\rho^2 - e^2},$

and consequently r has a constant value!

The desired locus of the point of intersection P is thus a circle whose center Z, which is situated on the extension of OM, is determined by (I) and whose radius r is determined by (II).

Naturally,  also  belonging  to  this  locus  are  the  points  of  intersection  Q,  R,  S  of 
the  tangents,  which  are  obtained  when  we  draw  the  tangents  through  the  points 
of  intersection  of  the  circle  T  with  the  extensions  of  XO  and  YO. 

The quadrilateral PQRS is simultaneously a tangent and chord quadrilateral, in that it circumscribes circle T and is inscribed in the locus circle about Z. If the right angle XOY is rotated about O so that the points X, Y describe the circle T, the quadrilateral PQRS continuously assumes different positions but always circumscribes circle T and is always inscribed in the circle about Z. Similarly, we see that in this way all the bicentric quadrilaterals belonging to these two circles are obtained. The obtained formulas (I) and (II) contain the solution to the problem posed.

We substitute the value obtained from (II) for $\rho^2 - e^2$ in (I) and obtain $e = 2z\rho^2/(r^2 - z^2)$. From this there follows $\rho^2 - e^2 = \rho^2[(r^2 - z^2)^2 - 4\rho^2 z^2]/(r^2 - z^2)^2$. When this value is introduced into (II) we finally obtain the sought-for relation between the radii r and $\rho$ and the axis z connecting the centers of the circumscribed and inscribed circles of the bicentric quadrilateral:

$2\rho^2(r^2 + z^2) = (r^2 - z^2)^2.$
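The locus argument and the resulting relation can be checked numerically. The following is only a sketch under stated assumptions: M is placed at the origin, O at (e, 0), and Z therefore at (-z, 0); formulas (I) and (II) are read as $z = \rho^2 e/(\rho^2 - e^2)$ and $r^2 = z^2 + 2\rho^4/(\rho^2 - e^2)$; the function names are illustrative and do not come from the text.

```python
import math

def tangent_intersection(rho, t1, t2):
    """Meet of the tangents touching the circle of radius rho at polar angles t1, t2."""
    half, mid = (t2 - t1) / 2.0, (t1 + t2) / 2.0
    d = rho / math.cos(half)
    return (d * math.cos(mid), d * math.sin(mid))

def bicentric_vertices(rho, e, theta):
    """Vertices of PQRS for the right angle at O = (e, 0), one leg turned by theta."""
    angles = []
    for leg in (theta, theta + math.pi / 2):
        c, s = math.cos(leg), math.sin(leg)
        root = math.sqrt(rho**2 - (e * s) ** 2)
        for sign in (1.0, -1.0):
            t = -e * c + sign * root  # signed distance from O to the circle along the leg
            angles.append(math.atan2(t * s, e + t * c) % (2 * math.pi))
    angles.sort()
    angles.append(angles[0] + 2 * math.pi)  # close the cycle of tangency points
    return [tangent_intersection(rho, angles[i], angles[i + 1]) for i in range(4)]

def locus_circle(rho, e):
    """Offset z = MZ and radius r of the locus circle, as read from (I) and (II)."""
    z = rho**2 * e / (rho**2 - e**2)
    r = math.sqrt(z**2 + 2 * rho**4 / (rho**2 - e**2))
    return z, r

if __name__ == "__main__":
    rho, e = 1.0, 0.5
    z, r = locus_circle(rho, e)
    for x, y in bicentric_vertices(rho, e, 0.3):
        print(math.hypot(x + z, y), r)  # distance of each vertex from Z, next to r
    print(2 * rho**2 * (r**2 + z**2), (r**2 - z**2) ** 2)  # both sides of the Fuss relation
```

For any turning angle the four vertices should lie at the same distance r from Z, and the two sides of the Fuss relation should agree up to rounding.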

The  developed  formula  comes  from  Nicolaus  Fuss  (1755-1826),  a  student 
and  friend  of  Leonhard  Euler.  Fuss  also  found  the  corresponding  formulas  for 
the bicentric pentagon, hexagon, heptagon, and octagon (Nova Acta Petropol., XIII, 1798).

The corresponding formula for the triangle had already been given by Euler. It is

$r^2 - z^2 = 2r\rho$

and is easily obtained in the following manner. Let ABC be any triangle, let Z and M be the respective centers, r and $\rho$ the radii of the circles of circumscription and inscription, respectively; thus, ZM = z is the axis connecting the centers; further, let D be the point at which the extension of CM meets the circumscribed circle, so that DM = DA = DB. The power of the circumscribed circle at M is

$MC \cdot MD = r^2 - z^2.$

However, since we can replace $\sin(\gamma/2)$ by the ratio $\rho/MC$ as well as by $AD/2r$ or $MD/2r$, we have $\rho/MC = MD/2r$, i.e.,

$MC \cdot MD = 2r\rho.$

When the two values found for the product $MC \cdot MD$ are set equal to each other we obtain Euler's formula.
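Euler's relation for the triangle is easy to test on concrete coordinates. The sketch below is an illustration only: it computes r, $\rho$, and z from the standard coordinate formulas for the circumcenter and incenter (not from the power-of-a-point argument used in the text) and returns both sides of the relation; the function name is hypothetical.

```python
import math

def euler_sides(A, B, C):
    """Return (r^2 - z^2, 2*r*rho) for the triangle with vertices A, B, C."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    r = a * b * c / (4 * area)                         # circumradius
    rho = area / s                                     # inradius
    # incenter M as the side-length weighted mean of the vertices
    mx = (a * A[0] + b * B[0] + c * C[0]) / (a + b + c)
    my = (a * A[1] + b * B[1] + c * C[1]) / (a + b + c)
    # circumcenter Z from the usual determinant formulas
    d = 2 * (A[0] * (B[1] - C[1]) + B[0] * (C[1] - A[1]) + C[0] * (A[1] - B[1]))
    a2, b2, c2 = (A[0]**2 + A[1]**2, B[0]**2 + B[1]**2, C[0]**2 + C[1]**2)
    zx = (a2 * (B[1] - C[1]) + b2 * (C[1] - A[1]) + c2 * (A[1] - B[1])) / d
    zy = (a2 * (C[0] - B[0]) + b2 * (A[0] - C[0]) + c2 * (B[0] - A[0])) / d
    z = math.hypot(zx - mx, zy - my)
    return r * r - z * z, 2 * r * rho

if __name__ == "__main__":
    print(euler_sides((0.0, 0.0), (4.0, 0.0), (0.0, 3.0)))
```

For the 3-4-5 right triangle one has r = 2.5, $\rho$ = 1, and $z^2$ = 1.25, so both entries equal 5.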

Note. Much more remarkable than the Fuss formula is a theorem concerning bicentric quadrilaterals that follows directly from the above. For convenience in expression we will make a prefatory observation.

Let a circle T lie completely inside another circle. If from any point on the outer circle we draw a tangent to T, extend the tangent line so that it intersects the outer circle, and draw from the point of intersection a new tangent to T, extend this tangent similarly to intersect the outer circle, and continue in this manner, we obtain a so-called Poncelet traverse which, when it consists of n chords of the larger circle, is called n-sided.

The  theorem  concerning  bicentric  quadrilaterals  now  reads: 

If on the circle of circumscription there is one point of origin for which a four-sided Poncelet traverse is closed, then the four-sided traverse will also close for any other point of origin on the circle.

The French mathematician Poncelet (1788-1867) demonstrated that this theorem is not limited to four-sided traverses only, but is generally true for n-sided traverses, and not only for circles, but for any type of conic section. The general theorem reads:

Poncelet's closure theorem: If an n-sided Poncelet traverse constructed for two given conic sections is closed for one position of the point of origin, it is closed for any position of the point of origin.
Annex to a Survey


To determine the position of unknown but accessible points of the earth's surface by taking the bearings of known points.

(A point on the earth's surface is considered as known when its geographic coordinates [longitude and latitude] are known.)

This  problem  is  of  great  importance  in  the  incorporation  of  new  points  of  the 
earth’s  surface  into  a  survey  and  consequently  in  the  preparation  of  accurate 
maps. 

Land surveyors and sailors are specifically confronted with the following two cases:

I. The Snellius-Pothenot problem; the problem of three inaccessible points: Determine the position of an unknown accessible point P by its bearings from three inaccessible known points A, B, C.

This  most  famous  of  all  land  surveying  problems  was  posed  and  solved  by 
the  Dutchman  Willebrord  Snellius  (1581-1626)  in  his  1617  work,  Eratosthenes 
Batavus,  but  attracted  no  attention  among  his  contemporaries.  It  was  not 
commonly  known  until  it  was  solved  once  again  by  the  Frenchman  Pothenot 
(died  1732)  in  a  paper  submitted  in  1692  to  the  French  Academy.  Since  then  it 
has  been  known  as  the  Pothenot  problem. 

II. Hansen's problem; the problem of the inaccessible distance: From the position of two known but inaccessible points A and B, determine the position of two unknown accessible points P and P' by bearings from A, B, P' to P and A, B, P to P'.

This  problem  was  solved  by  the  German  astronomer  Hansen  (1795-1874), 
but  was  solved  as  well  by  other  authors  before  him. 

Trigonometric  Solution 

This  type  of  solution  is  required  when  accuracy  is  important,  as  in  land 
surveying.  For  both  problems  this  type  of  solution  is  based  upon  the  sine  tangent 
theorem: 

If

$\sin\alpha / \sin\beta = m/n,$

then also

$\tan\dfrac{\alpha - \beta}{2} : \tan\dfrac{\alpha + \beta}{2} = (m - n)/(m + n).$

[From $\sin\alpha/\sin\beta = m/n$ it first follows that

$(\sin\alpha - \sin\beta)/(\sin\alpha + \sin\beta) = (m - n)/(m + n).$

If the numerator and denominator of the fraction on the left of the equation are converted into products, we obtain

$\cos\dfrac{\alpha + \beta}{2}\,\sin\dfrac{\alpha - \beta}{2} : \sin\dfrac{\alpha + \beta}{2}\,\cos\dfrac{\alpha - \beta}{2} = (m - n)/(m + n)$

or

$\tan\dfrac{\alpha - \beta}{2} : \tan\dfrac{\alpha + \beta}{2} = (m - n)/(m + n).$]

Solution  of  the  Pothenot  Problem 

Known are the five elements AC = a, BC = b, ∠ACB = $\gamma$, ∠APC = $\alpha$, ∠BPC = $\beta$; to be found are the five elements AP = x, BP = y, CP = z, ∠CAP = $\psi$, ∠CBP = $\varphi$. If the sine theorem is applied to the triangles ACP and BCP,

$\dfrac{\sin\psi}{\sin\alpha} = \dfrac{z}{a} \quad\text{and}\quad \dfrac{\sin\varphi}{\sin\beta} = \dfrac{z}{b}.$

FIG. 33.


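On division it follows from this that

$\sin\psi/\sin\varphi = b\sin\alpha/(a\sin\beta).$

We determine the auxiliary angle $\mu$ whose tangent is $b\sin\alpha/(a\sin\beta)$, and obtain

$\sin\psi/\sin\varphi = \tan\mu.$

From this it follows according to the sine tangent theorem that

$\tan\dfrac{\psi - \varphi}{2} : \tan\dfrac{\psi + \varphi}{2} = \dfrac{\tan\mu - 1}{1 + \tan\mu} = \tan(\mu - 45^\circ),$

i.e.,

$\tan\dfrac{\psi - \varphi}{2} = \tan\dfrac{\psi + \varphi}{2}\cdot\tan(\mu - 45^\circ).$

Since $\psi + \varphi$ (= 360° − $\alpha$ − $\beta$ − $\gamma$) is known, this equation gives us $\dfrac{\psi - \varphi}{2}$. From $\dfrac{\psi + \varphi}{2}$ and $\dfrac{\psi - \varphi}{2}$, addition and subtraction give us $\psi$ and $\varphi$.

The unknowns x, y, z are obtained from the following formulas derived from the sine theorem:

$\dfrac{x}{a} = \dfrac{\sin(\alpha + \psi)}{\sin\alpha}, \qquad \dfrac{y}{b} = \dfrac{\sin(\beta + \varphi)}{\sin\beta}, \qquad \dfrac{z}{a} = \dfrac{\sin\psi}{\sin\alpha}.$

The position of the point P is determined from the magnitudes $\psi$, $\varphi$, x, y, z.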
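The trigonometric solution translates almost line by line into a short routine. The sketch below assumes the configuration of the derivation, namely that ACBP is a convex quadrilateral with C and P opposite, so that $\psi + \varphi = 360^\circ - \alpha - \beta - \gamma$; angles are taken in radians, $\psi$ = ∠CAP and $\varphi$ = ∠CBP, and the function name is illustrative. The case $\psi + \varphi = 180^\circ$, where the tangent of the half sum is undefined, is not handled.

```python
import math

def pothenot(a, b, gamma, alpha, beta):
    """Three-point problem. Returns (x, y, z, psi, phi) = (AP, BP, CP, angle CAP, angle CBP)."""
    # auxiliary angle mu with tan(mu) = b sin(alpha) / (a sin(beta))
    mu = math.atan2(b * math.sin(alpha), a * math.sin(beta))
    half_sum = (2 * math.pi - alpha - beta - gamma) / 2
    # sine tangent theorem: tan((psi - phi)/2) = tan((psi + phi)/2) * tan(mu - 45 deg)
    half_diff = math.atan(math.tan(half_sum) * math.tan(mu - math.pi / 4))
    psi, phi = half_sum + half_diff, half_sum - half_diff
    x = a * math.sin(alpha + psi) / math.sin(alpha)
    y = b * math.sin(beta + phi) / math.sin(beta)
    z = a * math.sin(psi) / math.sin(alpha)
    return x, y, z, psi, phi

if __name__ == "__main__":
    # a = b = sqrt(13); the angles below are arbitrary sample bearings in radians
    print(pothenot(13 ** 0.5, 13 ** 0.5, 1.8, 0.54, 0.46))
```

A convenient way to test the routine is to start from known coordinates of A, B, C, P, measure the five given elements from them, and compare the returned distances with the true ones.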

Solution of Hansen's Problem

Known are the five elements AB = c, ∠APB = $\gamma$, ∠AP'B = $\gamma'$, ∠BPP' = $\delta$, ∠AP'P = $\delta'$, and consequently also the angles PAP' = $\alpha$ and PBP' = $\beta$; we do not know the seven elements AP = x, AP' = x', BP = y, BP' = y', ∠BAP' = $\psi$, ∠ABP = $\varphi$, and PP' = s.

We now represent the four ratios of the adjacent sides of the quadrilateral as sine ratios in accordance with the sine theorem:

$\dfrac{c}{x} = \dfrac{\sin\gamma}{\sin\varphi}, \qquad \dfrac{x}{s} = \dfrac{\sin\delta'}{\sin\alpha}, \qquad \dfrac{s}{y'} = \dfrac{\sin\beta}{\sin\delta}, \qquad \dfrac{y'}{c} = \dfrac{\sin\psi}{\sin\gamma'}.$

Multiplication of these equations gives us

$\dfrac{\sin\psi\,\sin\beta\,\sin\gamma\,\sin\delta'}{\sin\varphi\,\sin\alpha\,\sin\gamma'\,\sin\delta} = 1 \quad\text{or}\quad \dfrac{\sin\psi}{\sin\varphi} = \dfrac{\sin\alpha\,\sin\gamma'\,\sin\delta}{\sin\beta\,\sin\gamma\,\sin\delta'}.$

We then determine an auxiliary angle $\mu$ whose tangent is equal to the right side of this equation, and we obtain

$\dfrac{\sin\psi}{\sin\varphi} = \tan\mu,$

i.e., according to the sine tangent theorem as above,

$\tan\dfrac{\psi - \varphi}{2} = \tan\dfrac{\psi + \varphi}{2}\cdot\tan(\mu - 45^\circ).$

As above, we find from this $\dfrac{\psi - \varphi}{2}$ (since $\psi + \varphi = \delta + \delta'$ is known) and then $\psi$ and $\varphi$. Now the remaining unknowns are easily obtained by the sine theorem.


The  positions  of  P  and  P'  are  determined  by  the  values  found  for  the  six 
unknowns. 
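Hansen's problem can be sketched in the same way. The routine below makes one assumption that the text leaves to the figure: ABP'P is taken to be a convex quadrilateral (so that AP' and BP cross), which gives ∠PAP' = 180° − ($\gamma$ + $\delta$) − $\delta'$ and ∠PBP' = 180° − $\delta$ − ($\gamma'$ + $\delta'$). Angles are in radians; `gamma1` and `delta1` stand for $\gamma'$ and $\delta'$, and all names are illustrative.

```python
import math

def hansen(c, gamma, gamma1, delta, delta1):
    """Returns (x, x1, y, y1, s, psi, phi) = (AP, AP', BP, BP', PP', angle BAP', angle ABP)."""
    # angles at A and B, assuming the convex configuration A, B, P', P
    alpha = math.pi - (gamma + delta) - delta1
    beta = math.pi - delta - (gamma1 + delta1)
    # auxiliary angle: tan(mu) = sin(psi) / sin(phi)
    mu = math.atan2(math.sin(alpha) * math.sin(gamma1) * math.sin(delta),
                    math.sin(beta) * math.sin(gamma) * math.sin(delta1))
    half_sum = (delta + delta1) / 2
    half_diff = math.atan(math.tan(half_sum) * math.tan(mu - math.pi / 4))
    psi, phi = half_sum + half_diff, half_sum - half_diff
    # remaining unknowns by the sine theorem
    x = c * math.sin(phi) / math.sin(gamma)               # triangle ABP
    y = c * math.sin(gamma + phi) / math.sin(gamma)       # triangle ABP
    x1 = c * math.sin(gamma1 + psi) / math.sin(gamma1)    # triangle ABP'
    y1 = c * math.sin(psi) / math.sin(gamma1)             # triangle ABP'
    s = x * math.sin(alpha) / math.sin(delta1)            # triangle APP'
    return x, x1, y, y1, s, psi, phi

if __name__ == "__main__":
    print(hansen(4.0, 0.65, 0.52, 0.53, 0.61))  # sample angles in radians
```

As with the three-point problem, the routine is most easily checked against a configuration whose coordinates are known in advance.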


The  Drawing  Solution 

This  is  adequate  when  great  accuracy  is  not  requisite,  for  example,  in  sailing 
along  a  coast  where  A,  B,  C  are  known  landmarks,  P  and  P'  unknown  positions 
of  a  ship  with  a  bearing  on  these  landmarks. 

The solution of Pothenot's problem is extremely simple. The ship's position P is the point of intersection of the two circles to be drawn on the ship's chart with the chords AC and BC and the corresponding peripheral angles $\alpha$ and $\beta$.

Hansen's problem is solved in the following way. We draw a quadrilateral abp'p having the same form as ABP'P (beginning with an arbitrary distance pp') and lay this off on the chart so that b falls on B and a on AB. The ship's position P is the point of intersection of Bp with the parallel to ap passing through A, the ship's position P' is the point of intersection of Bp' with the parallel to pp' passing through P.
Alhazen's Billiard Problem


To  describe  in  a  given  circle  an  isosceles  triangle  whose  legs  pass  through 
two  given  points  inside  the  circle. 


This problem stems from the Arabic mathematician Abu Ali al Hassan ibn al Hassan ibn Alhaitham (ca. 965 - ca. 1039), whose name was transformed into Alhazen by the translators of his Optics. In his Optics the above problem has the following form: “Find the point on a spherical concave mirror at which a ray of light coming from a given point must strike in order to be reflected to another given point.”

This problem can be posed in various other forms, e.g.: “On a circular billiard table there are two balls; in what manner must one be struck in order for it to strike the other after rebounding from the cushion?” or “On the circumference of a circle find a point the sum of whose distances from two given points within the circle is equal to a minimum (or maximum).”

A  whole  series  of  famous  mathematicians  took  up  this  problem  after  Alhazen, 
among  them  Huygens,  Barrow,  de  L’Hopital,  Riccati,  and  Quetelet. 

Solution. Let the given circle have the center M and the radius r, let the given points be P and p, and let us make M the origin of a mutually perpendicular coordinate system xy in which P and p have the coordinates A|B and a|b.

If OS and Os, which pass through P and p, are the legs of the isosceles triangle OSs that we are looking for, the angles $\Phi$ and $\varphi$, which these legs form with the radius OM, must be equal.

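If we designate the angles that the lines PO, MO, pO form with the x-axis as $\Lambda$, $\mu$, $\lambda$, then, on the one hand, $\Phi = \Lambda - \mu$ and $\varphi = \mu - \lambda$, or

$\tan\Phi = \dfrac{\tan\Lambda - \tan\mu}{1 + \tan\mu\,\tan\Lambda} \quad\text{and}\quad \tan\varphi = \dfrac{\tan\mu - \tan\lambda}{1 + \tan\mu\,\tan\lambda},$

while, on the other hand, if x|y are the coordinates of O,

$\tan\Lambda = \dfrac{y - B}{x - A}, \qquad \tan\mu = \dfrac{y}{x}, \qquad \tan\lambda = \dfrac{y - b}{x - a},$

and consequently, since $\tan\Phi = \tan\varphi$,

$\dfrac{\dfrac{y - B}{x - A} - \dfrac{y}{x}}{1 + \dfrac{y}{x}\cdot\dfrac{y - B}{x - A}} = \dfrac{\dfrac{y}{x} - \dfrac{y - b}{x - a}}{1 + \dfrac{y}{x}\cdot\dfrac{y - b}{x - a}}$

or

$\dfrac{Ay - Bx}{x^2 + y^2 - Ax - By} = \dfrac{bx - ay}{x^2 + y^2 - ax - by}$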

or finally, if we set

$Ab + Ba = H, \quad Aa - Bb = K, \quad A + a = h, \quad B + b = k,$

then

$H(x^2 - y^2) - 2Kxy + (x^2 + y^2)[hy - kx] = 0.$

Since the point O(x|y) has to lie upon the given circle, the circle equation

(1) $x^2 + y^2 = r^2$

consequently applies here, and our condition assumes the form

(2) $H(x^2 - y^2) - 2Kxy + r^2[hy - kx] = 0.$

Since equation (2) represents a hyperbola, our conclusion reads as follows:

The point O that we are looking for is the point of intersection of the circle (1) with hyperbola (2).

Since  there  are  in  general  four  points  of  intersection  for  a  circle  and  a 
hyperbola,  there  are  in  general  four  solutions  to  our  problem. 
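The intersection of circle (1) with hyperbola (2) can be approximated numerically. The sketch below is not the construction of the text: it substitutes x = r cos t, y = r sin t into (2), scans the circle for sign changes, and refines each one by bisection (a swapped-in numerical technique, with an illustrative function name and sampling density). Roots of even multiplicity would be missed by such a scan.

```python
import math

def alhazen_points(r, P, p, samples=3600):
    """Points O on the circle x^2 + y^2 = r^2 that satisfy condition (2)."""
    A, B = P
    a, b = p
    H, K, h, k = A * b + B * a, A * a - B * b, A + a, B + b

    def f(t):
        x, y = r * math.cos(t), r * math.sin(t)
        return H * (x * x - y * y) - 2 * K * x * y + r * r * (h * y - k * x)

    roots = []
    step = 2 * math.pi / samples
    for i in range(samples):
        lo, hi = i * step, (i + 1) * step
        flo, fhi = f(lo), f(hi)
        if flo == 0.0:
            roots.append(lo)
        elif flo * fhi < 0:
            for _ in range(60):  # bisection on the bracketing interval
                mid = (lo + hi) / 2
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append((lo + hi) / 2)
    return [(r * math.cos(t), r * math.sin(t)) for t in roots]

if __name__ == "__main__":
    for O in alhazen_points(1.0, (0.5, 0.2), (-0.3, 0.4)):
        print(O)
```

At each point returned, the lines to P and p should make equal angles with the radius through that point, which is the reflection condition of the problem.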

Possessing particular interest is the special case in which the distances C and c of the given points P and p from the center M are equally great. In this case we naturally take the perpendicular bisector of Pp as the x-axis, and then we have

$A = a, \quad B = -b, \quad H = 0, \quad K = c^2, \quad h = 2a, \quad k = 0$

and, according to (2),

$-2c^2xy + 2ar^2y = 0.$

This equation is satisfied by each of the conditions

(3) $y = 0$ and (4) $x = a\,\dfrac{r^2}{c^2}.$

From (3) follows the corresponding $x = \pm r$. Consequently, the points of intersection of the given circle with the x-axis satisfy the condition for the point O we are looking for.

From (4) it follows that

$\dfrac{c^2}{a} = \dfrac{r^2}{x}.$

If we then draw through M a circle whose diameter $MN = d = c^2/a$ lies on the x-axis, and if Q(X|Y) is a point of intersection of this circle with the given circle, it follows, since MNQ is a right triangle, that

$MQ^2 = MN \cdot X$ or $r^2 = dX.$

However, since $r^2/x = d$, we obtain

$X = x.$

Consequently, the points of intersection of the two circles also satisfy the condition for the point O we are looking for.


For these points of intersection to exist, d must be > r or $c^2 > ar$. We will assume that this condition is satisfied.

Now the quadrilateral MPpQ in the circle with diameter MN is a chord quadrilateral, and therefore, according to Ptolemy's theorem, the sum of the products of the opposite sides must be equal to the product of the diagonals:

$PQ \cdot Mp + pQ \cdot MP = MQ \cdot Pp$

or

(5) $(PQ + pQ)\,c = 2br.$

For any other point Q' of the given circle, MPpQ' is not a chord quadrilateral, and therefore the sum of the products of the opposite sides must be greater than the product of the diagonals:

(6) $(PQ' + pQ')\,c > 2br.$

From (5) and (6) we obtain

$PQ + pQ < PQ' + pQ'.$

The problem: “On a given circle find a point the sum of whose distances from two given points located in the circle at an equal distance from the midpoint of the circle is a minimum” has the following striking solution:

The point we are looking for is the point of intersection of the given circle with the circle that passes through the given points and the center of the given circle.

Note. In connection with the above problem Alhazen also solved the problem: “How to strike a ball lying on a circular billiard table in such a way that after twice striking the cushion the ball will return to its original position?”
Solution. Let the billiard table possess the radius r and the center M. Let the initial position of the ball be P, so that MP = c is known. Let the ball first strike the circle at U, cross the extension of PM at a right angle at F, then strike the circle at V and return from here to P. UM and VM are then angle bisectors of the triangle PUV. We set

$MF = x, \quad FU = y, \quad UP = z.$

FIG. 36.

Applying the angle bisector theorem to the triangle FUP,

$y/z = x/c,$

and according to the Pythagorean theorem

$r^2 = x^2 + y^2$ and $z^2 = y^2 + (x + c)^2.$

If we eliminate y and z from these three equations, we obtain the quadratic equation

$2cx^2 + r^2x = cr^2$

for the unknown x. From this, x is easily constructed.
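The quadratic $2cx^2 + r^2x = cr^2$ can be solved directly and the resulting path checked against the law of reflection. In the sketch below the coordinates are an assumption made for illustration: M is the origin, P = (c, 0) with 0 < c < r, so that F = (-x, 0), U = (-x, y), and V = (-x, -y); the function name is hypothetical.

```python
import math

def two_cushion_points(r, c):
    """Positive root x = MF of 2c x^2 + r^2 x = c r^2, with the cushion points U and V."""
    x = (-r**2 + math.sqrt(r**4 + 8 * c**2 * r**2)) / (4 * c)
    y = math.sqrt(r**2 - x**2)
    return x, (-x, y), (-x, -y)

def angle_at(O, U, V):
    """Unsigned angle UOV at the vertex O."""
    ux, uy = U[0] - O[0], U[1] - O[1]
    vx, vy = V[0] - O[0], V[1] - O[1]
    return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)

if __name__ == "__main__":
    r, c = 1.0, 0.5
    x, U, V = two_cushion_points(r, c)
    # the radius UM should bisect the angle PUV
    print(angle_at(U, (c, 0.0), (0.0, 0.0)), angle_at(U, V, (0.0, 0.0)))
```

The two printed angles should agree, which confirms that UM bisects the angle PUV; by symmetry the same holds at V.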


* By the power of a circle at a point is meant the amount by which the square of the distance from the center of the circle to the point exceeds the square of the radius of the circle. In accordance with the secant or chord theorem it can also be represented as the product of the two segments originating from the point that are generated by the circle through the point on any secant.

* The segment ratio AX:BX is considered positive if X is situated outside AB and negative if X is inside AB.

* D'Alembert (1717-1783), a French mathematician.

* Let arc Q|b mean the circle arc whose midpoint is Q and radius b.

* The proposal that this number be designated as $\pi$ came from Leonhard Euler (Commentarii Academiae Petropolitanae ad annum 1739, vol. IX).


Problems  Concerning  Conic  Sections  and  Cycloids 
An Ellipse from Conjugate Radii


To  draw  an  ellipse  for  which  the  magnitude  and  position  of  two  conjugate 
radii  are  given. 


Solution. Let the ellipse have the center equation

(1) $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1.$

Let the prescribed conjugate radii be OP and OQ such that the coordinates x|y and x'|y' of their end points satisfy the conditions

(2) $\dfrac{x'}{a} = -\dfrac{y}{b}, \qquad \dfrac{y'}{b} = \dfrac{x}{a}.$

(The conditions (2) give us directly for the product of the slopes y/x and y'/x' of the two radii the known value $-b^2/a^2$ for the product of the slopes of the conjugate radii.)


FIG.  37. 

Let the base point of the ordinate from Q be V. We rotate the right triangle OQV clockwise about O by 90° to the position Oqv and extend the straight line Pq to intersect with the axes of the ellipse at H and K. According to (2), the distances of the points q and P from the x-axis and the distances of the points P and q from the y-axis are in the ratio of a/b. Consequently (according to the ray theorem),

$\dfrac{Hq}{HP} = \dfrac{a}{b} \quad\text{and}\quad \dfrac{KP}{Kq} = \dfrac{a}{b}.$

It then follows from this that

$\dfrac{HP + Pq}{HP} = \dfrac{Kq + qP}{Kq}$, i.e., $HP = Kq,$

so that the center M of Pq is also the center of HK.

If we substitute HP for Kq, one of our proportions becomes

(3) $KP/HP = a/b.$


In order to obtain a second equation for the unknowns KP and HP, we obtain the cosine and sine of the angle $\nu$ from HK to the x-axis:

$\cos\nu = x/KP, \quad \sin\nu = y/HP;$

squaring and adding, we obtain

(4) $\dfrac{x^2}{KP^2} + \dfrac{y^2}{HP^2} = 1.$

From (1), (3), and (4) it immediately follows that

$KP = a, \quad HP = b.$

This gives us the following simple

Construction. 1. We rotate OQ about O by 90° through the interior of the obtuse angle POQ to the position Oq. 2. We determine the center M of Pq and the points of intersection H and K of the line Pq with the circle of center M and radius MO.

KP  and  HP  are  then  equal  to  half  the  length  of  the  axes  of  the  ellipse,  while 
OH  and  OK  represent  the  positions  of  the  axes  of  the  ellipse. 

The  rest  is  simple. 
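The construction can be followed step by step in coordinates. The sketch below assumes that O is the origin, that P and Q are given as coordinate pairs, and that the ellipse is not a circle (otherwise P and q coincide and the line Pq is undefined). If the angle POQ is acute, the opposite radius is used in place of OQ. The function name and return convention are illustrative only.

```python
import math

def axes_from_conjugate_radii(P, Q):
    """Returns (a, b, H, K): the semi-axes KP and HP and the points H, K on the axes."""
    px, py = P
    qx, qy = Q
    if px * qx + py * qy > 0:      # take the opposite radius so that angle POQ is obtuse
        qx, qy = -qx, -qy
    if qx * py - qy * px > 0:      # P lies counterclockwise from Q
        rx, ry = -qy, qx           # turn OQ counterclockwise by 90 degrees
    else:
        rx, ry = qy, -qx           # turn OQ clockwise by 90 degrees
    mx, my = (px + rx) / 2, (py + ry) / 2   # center M of Pq
    R = math.hypot(mx, my)                  # radius MO of the auxiliary circle
    ux, uy = px - mx, py - my
    n = math.hypot(ux, uy)                  # zero only for a circle (not handled)
    ux, uy = ux / n, uy / n
    H = (mx + R * ux, my + R * uy)          # intersection on the side of P
    K = (mx - R * ux, my - R * uy)          # intersection on the side of q
    return math.dist(K, P), math.dist(H, P), H, K

if __name__ == "__main__":
    t = 0.7
    P = (3 * math.cos(t), 2 * math.sin(t))
    Q = (-3 * math.sin(t), 2 * math.cos(t))
    print(axes_from_conjugate_radii(P, Q))
```

For conjugate radii taken from an ellipse with semi-axes 3 and 2, the routine should return these two values, with H on the axis of length 2a and K on the axis of length 2b.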
An Ellipse in a Parallelogram


To inscribe in a prescribed parallelogram an ellipse that is tangent to the parallelogram at a given boundary point.

The solution of this problem is based upon the theorem: Every ellipse can be considered as a normal projection of a circle.

Let ABCD be the given parallelogram, N the given boundary point lying on AB. Let the other points at which the ellipse touches the boundary of the parallelogram be K on BC, M on CD, and H on DA.

In the normal projection, in which the ellipse is the image of a circle, the parallelogram ABCD and the tangency points N, K, M, H appear as projections of a parallelogram circumscribing a circle, and specifically of a rhombus abcd with the tangency points n, k, m, h.

Since nk ∥ hm ∥ ac and nh ∥ km ∥ bd and since parallelism is preserved in a normal projection, NK ∥ HM ∥ AC and NH ∥ KM ∥ BD. Thus, we find the tangency points H and K, respectively, by causing the parallels through N to BD and AC to intersect with DA and BC, respectively. The fourth tangency point M is the point of intersection of CD with the parallel through H to AC.

Let  the  centers  of  the  circle  and  ellipse  be  o  and  O,  respectively. 

We  will  now  assume  an  arbitrary  point  z  on  the  arc  nh  of  the  circle,  connect 
this  point  with  m  and  n,  and  designate  the  points  of  intersection  of  these 
connecting  lines  with  hk  and  da  as  x  and  y.  The  two  triangles  omx  and  any  are 
then  similar,  since  the  angles  at  o  and  a,  as  well  as  the  angles  at  m  and  n,  are 
equal  because  they  are  enclosed  between  pairs  of  orthogonal  legs.  From  this 
similarity  we  obtain  the  proportion 


$ox/om = ay/an.$

If we substitute oh for om and ah for an in this proportion, we obtain

$ox/oh = ay/ah.$

Let the normal projections of the points x, y, z be X, Y, Z. Since the ratio of parallel segments is not altered in normal projection, we have

$OX/OH = AY/AH.$

The  points  X  and  Y  accordingly  divide  the  radius  of  the  ellipse  OH  and  the 
ellipse  tangent  AH  in  the  same  proportions. 

Quite  similar  proportions  are  naturally  found  to  obtain  for  the  other  ellipse 
arcs  MH,  MK,  NK. 

We  assign  the  tangents  AH,  BK,  DH,  CK  to  the  arcs  NH,  NK,  MH,  MK, 
respectively. 

In  summary  we  can  then  say: 

If we connect a point of one of the four arcs with M and N, the points of intersection of these connecting lines with the radius (OH or OK) and the corresponding tangents divide the radius and tangents in the same proportions.

This gives rise to the following elegant construction.

We divide the radii OH and OK and the tangents AH, BK, DH, CK each into $\nu$ equal segments (eight segments are shown in Figure 38) and number the segments from 1 to $\nu$, beginning from the center of the ellipse with the radii and at the corners of the parallelogram with the tangents. We then connect M (N) with an arbitrary segment point of a radius and N (M) with the segment point with the same number of the tangent corresponding to the arc bounded by N (M) and the end point of the radius. The point of intersection of the two connecting lines is in each case a point on the ellipse.


FIG.  38. 
A Parabola from Four Tangents


To  draw  a  parabola  four  tangents  to  which  are  given. 


The  simplest  solution  of  this  beautiful  problem  is  based  upon 

Lambert's Theorem: The circumcircle of a parabola tangent triangle passes through the focus.

(J. H. Lambert (1728-1777) was a German mathematician.)

In order to prove Lambert's theorem we need the

Theorem of similar triangles: Two tangents SA and SB to a parabola, together with the lines from the focus to the contact points A and B and the point of intersection S of the tangents, form two similar triangles FSA and FSB such that the angle of the one triangle, situated at the point of tangency, is always equal to the angle of the other triangle that is situated at the point of intersection.

Proof. In accordance with the classical construction of the parabola, the mirror images H and K of the focus F on the tangents SA and SB, respectively, fall on the base points of the altitudes dropped from A and B, respectively, on the directrix L.

Since the angles FAS and HAS are symmetrical, and the angles HAS and FHK, as angles between pairs of orthogonal legs, are equal, it follows that

∠FAS = ∠FHK

and likewise that

∠FBS = ∠FKH.

The angles FHK and FKH, as the inscribed angles opposite the chords FK and FH, respectively, on the circumcircle of the triangle FHK (whose center is the intersection S of the perpendicular bisectors SA and SB of the triangle) are half as great as the corresponding central angles and consequently equal to angles FSB and FSA, respectively. Consequently,

∠FAS = ∠FSB and ∠FBS = ∠FSA. Q.E.D.

Lambert’s  theorem  follows  directly  from  the  theorem  we  have  just  proved. 

In fact: If P and Q are the points of intersection of the tangents SA and SB with a third tangent that touches the parabola at O, then, according to the theorem of similar triangles,

∠FAS = ∠FSB and ∠FAP = ∠FPO

and consequently

∠FSQ = ∠FPQ.

According to this equation, however, the quadrilateral FPSQ is a chord quadrilateral.

Lambert's theorem gives us directly the requisite construction: From the four tangent triangles that can be formed from the four given tangents, we choose two and draw the circumcircle for each. The point of intersection of the two circumcircles is the focus. We then find the mirror image of the focus on two tangents and in this way obtain two points of the directrix, which gives us the directrix. The rest is extremely simple.
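The construction by circumcircles can be sketched in coordinates. The assumptions here are for illustration only: each tangent is given as a triple (a, b, c) standing for the line ax + by = c, and the two tangent triangles are chosen so that they share the vertex in which the first two lines meet. The two circumcircles then meet in that vertex and in the focus, so the focus is found by reflecting the shared vertex in the line of centers. All function names are hypothetical.

```python
import math

def intersect(l1, l2):
    """Point of intersection of the lines a x + b y = c."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def circumcenter(A, B, C):
    d = 2 * (A[0] * (B[1] - C[1]) + B[0] * (C[1] - A[1]) + C[0] * (A[1] - B[1]))
    a2, b2, c2 = (A[0]**2 + A[1]**2, B[0]**2 + B[1]**2, C[0]**2 + C[1]**2)
    return ((a2 * (B[1] - C[1]) + b2 * (C[1] - A[1]) + c2 * (A[1] - B[1])) / d,
            (a2 * (C[0] - B[0]) + b2 * (A[0] - C[0]) + c2 * (B[0] - A[0])) / d)

def reflect_over_line(P, A, B):
    """Mirror image of P in the line through A and B."""
    dx, dy = B[0] - A[0], B[1] - A[1]
    t = ((P[0] - A[0]) * dx + (P[1] - A[1]) * dy) / (dx * dx + dy * dy)
    return (2 * (A[0] + t * dx) - P[0], 2 * (A[1] + t * dy) - P[1])

def mirror_in_tangent(P, line):
    """Mirror image of P in the line a x + b y = c (a point of the directrix if P is the focus)."""
    a, b, c = line
    k = 2 * (a * P[0] + b * P[1] - c) / (a * a + b * b)
    return (P[0] - k * a, P[1] - k * b)

def parabola_focus(l0, l1, l2, l3):
    """Second common point of the circumcircles of the triangles (l0, l1, l2) and (l0, l1, l3)."""
    V = intersect(l0, l1)  # vertex shared by both triangles
    c1 = circumcenter(V, intersect(l0, l2), intersect(l1, l2))
    c2 = circumcenter(V, intersect(l0, l3), intersect(l1, l3))
    return reflect_over_line(V, c1, c2)

if __name__ == "__main__":
    # tangents of y = x^2/4 at x = -2, 1, 3, 4, written as (t/2) x - y = t^2/4
    lines = [(t / 2, -1.0, t * t / 4) for t in (-2.0, 1.0, 3.0, 4.0)]
    F = parabola_focus(*lines)
    print(F, mirror_in_tangent(F, lines[0]), mirror_in_tangent(F, lines[1]))
```

For the parabola $y = x^2/4$ the focus is (0, 1) and the directrix is y = -1, so the two mirror images should both have the ordinate -1.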


Note. The theorem of the circumcircle of the tangent triangle leads directly to the solution of the interesting problem:

Determine the locus of the foci of all parabolas that are tangent to three straight lines.

The sought-for locus is the circumcircle of the triangle formed from the lines.
A Parabola from Four Points


To  draw  a  parabola  that  passes  through  four  given  points. 


This lovely problem was first solved by Newton in his celebrated Philosophiae naturalis principia mathematica, 1687, and then once again in 1707 in his Arithmetica universalis.

It  is  commonly  based  upon  the  auxiliary  problem: 

To  draw  a  parabola  for  which  three  points  and  direction  of  the  axis  are 
known. 

The  following  solution  of  the  auxiliary  problem  is  based  on  the  two 
theorems: 

I. The centers of parallel chords of a parabola lie on a parallel to the axis.

II.  The  perpendicular  bisector  of  a  parabola  chord  and  the  perpendicular  to 
the  axis  through  the  center  of  the  chord  mark  off  the  half  parameter  on  the  axis. 

Proof. The vertex equation of a parabola is commonly expressed in the form $y^2 = 2px$. If $(x \mid y)$ and $(X \mid Y)$ are the end points of a parabola chord, the slope of the chord with respect to the x-axis is $\vartheta = (Y - y)/(X - x)$. From

$$y^2 = 2px \quad\text{and}\quad Y^2 = 2pX$$

it follows, however, by subtraction that

$$(Y + y)(Y - y) = 2p(X - x), \quad\text{i.e.,}\quad \vartheta = \frac{2p}{Y + y}.$$

If we call the ordinate of the midpoint of the chord $\eta$, the last equation can be written (because $2\eta = Y + y$) in the form

$$\eta\vartheta = p.$$

According to this equation, the midpoints of all chords with the same slope $\vartheta$ have the same ordinate, with the result that these midpoints lie on a line parallel to the axis of the parabola, and thus I. is proved.

To prove II., we take note of the fact that the segment marked off on the axis by the perpendicular bisector of our chord and the perpendicular to the axis through the chord midpoint is equal to $\eta\Theta$, where $\Theta$ is the slope of the perpendicular bisector of the chord with respect to the perpendicular to the axis. However, since $\Theta = \vartheta$, the length of the segment is $\eta\vartheta = p$, which was to be proved. From II. it also follows that: If the midpoints of two parabola chords lie on a perpendicular to the axis, the perpendicular bisectors of the chords intersect on the axis.
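Both theorems are easy to check numerically. The following Python sketch is not part of the original text; the function names and the sample values are illustrative assumptions. It takes two points of the parabola $y^2 = 2px$, computes the slope and midpoint of their chord, and measures the segment that the perpendicular bisector cuts off on the axis.

```python
import math

def chord_data(p, y1, y2):
    """Slope and midpoint of the chord of y^2 = 2 p x whose end points have
    ordinates y1 and y2 (the chord must not be perpendicular to the axis)."""
    x1, x2 = y1 * y1 / (2 * p), y2 * y2 / (2 * p)
    slope = (y2 - y1) / (x2 - x1)
    midpoint = ((x1 + x2) / 2, (y1 + y2) / 2)
    return slope, midpoint

def axis_segment(p, y1, y2):
    """Segment on the axis between the foot of the perpendicular dropped from
    the chord midpoint and the point where the perpendicular bisector meets
    the axis (Theorem II says this equals p)."""
    slope, (xm, ym) = chord_data(p, y1, y2)
    # Perpendicular bisector: y - ym = -(x - xm) / slope; set y = 0.
    x_on_axis = xm + ym * slope
    return abs(x_on_axis - xm)

slope, (_, eta) = chord_data(1.5, -1.0, 4.0)
print(eta * slope, axis_segment(1.5, -1.0, 4.0))
```

Two chords whose end-point ordinates have the same sum are parallel, so by Theorem I their midpoints should share the same ordinate; the same helper can be used to confirm that.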

Let A, B, C be the given points of the parabola, and let the direction of the axis be given. Let us draw through the midpoint M of AB a parallel to the axis, through the midpoint N of CA the perpendicular to the axis, and call their point of intersection $M_0$. Then according to I., $M_0$ is the midpoint of the parabola chord $A_0B_0$ that passes through $M_0$ and is parallel to AB. We draw the perpendicular bisectors of CA and $A_0B_0$ (the latter as a perpendicular dropped from $M_0$ to AB). According to II., their point of intersection is a point on the axis, and its distance from the base point of the perpendicular dropped from $M_0$ or N to the axis is the half parameter $p$. The rest is simple. For example, making use of the subnormal ($p$) from A, we draw the normal AU and the tangent AV (both being drawn to the axis). The midpoint of UV is then the focus and the mirror image of the focus in the tangent is a point on the directrix.
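The auxiliary problem also has a short algebraic counterpart, which can be useful for checking a drawing. The sketch below is an assumption-laden illustration, not the ruler-and-compass construction of the text: it rotates the coordinates so that the prescribed axis direction becomes the u-axis, fits $u = Av^2 + Bv + C$ through the three points, and reads off vertex, focus, and half parameter. The function name and the return convention are hypothetical.

```python
import math

def parabola_from_three_points(points, axis_angle):
    """Parabola through three points whose axis makes the angle `axis_angle`
    (radians) with the x-axis. Returns (vertex, focus, half parameter p)."""
    c, s = math.cos(axis_angle), math.sin(axis_angle)
    # (u, v): u measured along the axis direction, v perpendicular to it.
    (u1, v1), (u2, v2), (u3, v3) = [(x * c + y * s, -x * s + y * c)
                                    for x, y in points]
    # Fit u = A v^2 + B v + C by divided differences (the v's must differ).
    d12 = (u2 - u1) / (v2 - v1)
    d23 = (u3 - u2) / (v3 - v2)
    A = (d23 - d12) / (v3 - v1)
    B = d12 - A * (v1 + v2)
    C = u1 - A * v1 * v1 - B * v1
    v0 = -B / (2 * A)                 # the axis is the line v = v0
    u0 = C - B * B / (4 * A)          # vertex abscissa along the axis
    p = 1 / (2 * abs(A))              # (v - v0)^2 = (u - u0) / A, so 2p = 1/|A|
    u_focus = u0 + 1 / (4 * A)        # focus lies p/2 from the vertex

    def to_xy(u, v):
        return (u * c - v * s, u * s + v * c)

    return to_xy(u0, v0), to_xy(u_focus, v0), p

vertex, focus, p = parabola_from_three_points([(2, -2), (0.5, 1), (4.5, 3)], 0.0)
print(vertex, focus, p)
```

The three sample points lie on $y^2 = 2x$, so the call should recover the vertex at the origin, the focus at distance $p/2$ on the x-axis, and $p = 1$.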


The  solution  of  Newton’s  parabola  problem  is  based  upon  the  following 
auxiliary  theorem:  In  all  parabola  quadrilaterals  the  products  of  the  diagonal 
segments  are  proportional  to  the  squares  of  the  segments  on  the  diagonals  that 
are  bounded  by  their  point  of  intersection  and  the  axis  of  the  parabola. 

Proof. Let AB be an arbitrary parabola chord, let M be its midpoint, and U the point of intersection of the parabola with the parallel to the parabola axis through M. If we select UM as the x-axis and the parabola tangent through U as the y-axis, we obtain the usual parabola equation in the form

$$y^2 = 4kx,$$

where $k$ is the focal radius of the coordinate origin U. The coefficient $4k$ possesses the value $2p/\sin^2\kappa$, where $2p$ is the parameter and $\kappa$ the angle enclosed between the coordinate axes or the angle formed by the chord AB with the axis of the parabola.

We select an arbitrary point O on AB and designate the point of intersection of the parallel to the x-axis through O with the parabola as Q, the coordinates of Q as $x$ and $y$, and the coordinates of A as $X$ and $Y$, so that

$$QO = q = X - x, \quad OA = Y - y, \quad OB = Y + y.$$

From

$$Y^2 = 4kX \quad\text{and}\quad y^2 = 4kx$$

it follows by subtraction that

$$Y^2 - y^2 = 4k(X - x)$$

or

$$(Y + y)(Y - y) = 4k(X - x),$$

so that

(1)  $OA \cdot OB = 4kq.$


If A'B' is a second parabola chord through O, then accordingly

(2)  $OA' \cdot OB' = 4k'q$

with $4k' = 2p/\sin^2\kappa'$, where $\kappa'$ is the angle of the chord A'B' with the parabola axis.

Division of (1) and (2) gives

$$\frac{OA \cdot OB}{OA' \cdot OB'} = \frac{k}{k'} = \frac{\sin^2\kappa'}{\sin^2\kappa}.$$

If H and H' are the points of intersection of the chords AB and A'B' with the parabola axis, it follows from the sine theorem that

$$\frac{OH}{OH'} = \frac{\sin\kappa'}{\sin\kappa}.$$

From the last two equations we finally obtain

$$\frac{OA \cdot OB}{OA' \cdot OB'} = \frac{OH^2}{OH'^2}. \qquad \text{Q.E.D.}$$

With  this  theorem  we  can  now  obtain  the  following  solution  to  Newton’s 
problem:  Let  A,  B,  C,  D  be  the  given  points.  We  draw  the  diagonals  AC  and  BD 
of  the  quadrilateral  ABCD  and  call  their  point  of  intersection  O.  On  the  diagonals 
we mark off from O the mean proportionals $OP = \sqrt{OA \cdot OC}$ and $OQ = \sqrt{OB \cdot OD}$.
The  connecting  line  QP,  according  to  the  theorem  we  have  just  proved,  is  then 
parallel  to  the  parabola  axis,  and  the  problem  now  reduces  to  the  auxiliary 
problem  treated  above. 
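The step from the four points to the axis direction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's procedure verbatim: the points are assumed to be given in convex order so that AC and BD are the diagonals, the helper names are hypothetical, and OQ is marked off on either side of O, which yields two candidate directions.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of the line p1p2 with the line p3p4 (assumed not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def axis_directions(A, B, C, D):
    """Unit vectors along the two lines PQ obtained from the mean
    proportionals OP = sqrt(OA*OC) and OQ = sqrt(OB*OD)."""
    O = line_intersection(A, C, B, D)

    def dist(P):
        return math.hypot(P[0] - O[0], P[1] - O[1])

    OP = math.sqrt(dist(A) * dist(C))
    OQ = math.sqrt(dist(B) * dist(D))
    u = ((C[0] - O[0]) / dist(C), (C[1] - O[1]) / dist(C))  # along diagonal AC
    v = ((D[0] - O[0]) / dist(D), (D[1] - O[1]) / dist(D))  # along diagonal BD
    P = (O[0] + OP * u[0], O[1] + OP * u[1])
    directions = []
    for sign in (1, -1):              # Q on either side of O
        Q = (O[0] + sign * OQ * v[0], O[1] + sign * OQ * v[1])
        dx, dy = Q[0] - P[0], Q[1] - P[1]
        norm = math.hypot(dx, dy)
        directions.append((dx / norm, dy / norm))
    return directions

# Four points of y^2 = 2x, whose axis is the x-axis.
print(axis_directions((2, -2), (0.5, -1), (0.5, 1), (4.5, 3)))
```

For the sample points one of the two directions should come out parallel to the x-axis; the other belongs to the second parabola through the same four points.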

The following projective solution of Newton’s problem also consists of the
reduction  of  the  problem  to  the  preceding  auxiliary  problem.  This  transformation 
of  the  problem  is  accomplished  by  means  of  Desargues’  involution  theorem  (No. 
63).  According  to  this  theorem,  every  tangent  to  a  parabola  cuts  the  opposite 
sides  of  an  inscribed  quadrilateral  in  point  pairs  of  an  involution  in  which  the 
point  of  tangency  of  the  tangent  is  a  double  point. 

As tangent T let us choose a very distant one. Let it be tangent to the parabola at O and let it be cut at P, Q, P', and Q' by the lines AB, BC, CD, DA connecting the four given parabola points. O is then the double point of the involution determined by the pairs (P, P') and (Q, Q'). Similarly, the rays drawn from an arbitrary point Z of the picture plane to P, Q, P', Q', O form an involution with the ray pairs (ZP, ZP') and (ZQ, ZQ') and the double ray ZO. Because of the very great distances of the points P, Q, P', Q', O the rays ZP, ZQ, ZP', ZQ' on the drawing paper run parallel to the quadrilateral sides AB, BC, CD, DA, and the ray ZO here runs parallel to the axis of the parabola. (The slope $(y - b)/(x - a) = (\sqrt{2px} - b)/(x - a)$ of the line connecting the points $Z(a \mid b)$ and $O(x \mid y)$, because of the great value of $x$, is essentially equal to zero, so that the ray ZO appears parallel to the axis on the drawing paper.)

Accordingly  we  obtain  the  following  construction.  We  draw  through  an 
arbitrary  point  Z  of  the  paper  the  parallels  p,  q,  p',  q'  to  the  lines  AB,  BC,  CD, 
and  DA  and  construct  a  double  ray  of  the  involution  determined  by  the  ray  pairs 
(p, p') and (q, q'); this ray has the direction of the parabola axis. Thus, the
problem  is  reduced  to  the  auxiliary  problem  solved  above. 

Since  in  ray  involution  there  are  in  general  two  double  rays,  there  are  in 
general  two  parabolas  that  can  be  drawn  through  four  given  points. 
A Hyperbola from Four Points


To draw a right-angle (equilateral) hyperbola for which four points are given.


The construction is based upon the auxiliary theorem: The Feuerbach circle of a triangle inscribed in an equilateral hyperbola passes through the center of the hyperbola.


Proof. Let ABC be a triangle inscribed in an equilateral hyperbola with the center at Z and the asymptotes I and II; let A', B', C' be the midpoints of the sides BC, CA, AB, and let $A_1$ and $A_2$ be the points of intersection of BC with I and II, and $B_1$ and $B_2$ the points of intersection of CA with I and II.

Since the asymptotes mark off equal segments on the extensions of a hyperbola chord, $BA_2 = CA_1$ and $CB_2 = AB_1$, and A' is the midpoint of $A_1A_2$ and B' the midpoint of $B_1B_2$. These midpoints are also the centers of the circumcircles of the right triangles $A_1ZA_2$ and $B_1ZB_2$, so that

$$\angle A'ZA_1 = \angle A'A_1Z \quad\text{and}\quad \angle B'ZB_1 = \angle B'B_1Z.$$

Since the difference of the left sides of these equations represents angle A'ZB' and the difference of the right sides angle $A_1CB_1$ (according to the theorem of external angles), both of these angles are equal, or, in other words, angles A'ZB' and A'CB' are supplementary. However, since the angles of the parallelogram CA'C'B' at C and C' are equal, angles A'ZB' and A'C'B' are also supplementary. The quadrilateral ZA'C'B' is therefore a cyclic quadrilateral. In other words: the circumcircle of the triangle A'B'C', i.e., the Feuerbach circle of the triangle ABC (see No. 28), passes through the center of the hyperbola. Q.E.D.

Construction. Let the four given points be A, B, C, D. We draw the Feuerbach circles of the triangles ABC and ABD; their point of intersection Z (other than the midpoint of AB, which lies on both circles) is the center of the hyperbola. We connect Z to the midpoint A' of BC, draw the circle with center A' and radius A'Z, and at its points of intersection $A_1$ and $A_2$ with the line BC we have two points of the asymptotes I and II, which gives us the asymptotes. The rest is easy. (To draw the hyperbola from points, for example, we pass an arbitrary line through one of the given points, for example A, and mark off on this line the segment between A and I from II toward A; the point at the end of the marked-off segment is a new point of the hyperbola. Repetition of the construction with new lines through A gives us as many points of the hyperbola as desired.)
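The auxiliary theorem can be checked with exact arithmetic. The sketch below is an illustration only; it assumes the hyperbola is taken as $xy = 1$ (center at the origin), builds the Feuerbach circle as the circle through the three side midpoints, and tests whether the origin lies on it. All function names are hypothetical.

```python
from fractions import Fraction

def circumcircle(P, Q, R):
    """Center and squared radius of the circle through P, Q, R."""
    (ax, ay), (bx, by), (cx, cy) = P, Q, R
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2

def feuerbach_circle(A, B, C):
    """Feuerbach (nine-point) circle as the circle through the side midpoints."""
    def mid(P, Q):
        return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
    return circumcircle(mid(B, C), mid(C, A), mid(A, B))

def on_hyperbola(t):
    """Point with abscissa t on the equilateral hyperbola x*y = 1."""
    t = Fraction(t)
    return (t, 1 / t)

A, B, C, D = (on_hyperbola(t) for t in (1, 2, -3, Fraction(1, 2)))
for triangle in ((A, B, C), (A, B, D)):
    center, r2 = feuerbach_circle(*triangle)
    # The hyperbola center (0, 0) lies on the circle when this prints True.
    print(center[0] ** 2 + center[1] ** 2 == r2)
```

Because both circles also pass through the midpoint of AB, the hyperbola center is their second point of intersection, which is what the construction uses.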

Note.  The  proved  auxiliary  theorem  immediately  gives,  as  well,  the 
solution  to  the  interesting 

Locus  problem:  Find  the  locus  of  the  centers  of  all  equilateral  hyperbolas 
that  can  be  circumscribed  about  a  given  triangle. 

The  locus  is  the  Feuerbach  circle  of  the  given  triangle. 
Van Schooten’s Locus Problem


Two vertexes of a rigid triangle in a plane slide along the arms of an angle of the plane; what locus does the third vertex describe?


Franciscus van Schooten (the younger) (1615-1660), a Dutch mathematician, treated this beautiful problem in his Exercitationes mathematicae, which appeared in 1657.

Solution. We will first consider a special case of van Schooten’s problem, the solution to which had already been taught by the Byzantine Proclus (410-485).

On  a  rigid  line  three  points  are  marked;  two  of  these  slide  along  the  arms  of  a 
right  angle;  what  locus  does  the  third  describe? 

We select the arms I and II of the right angle as the x- and y-axes of a coordinate system. Let the three marked points of the rigid line be A, B, C, their mutual distances BC = a, CA = b, and AB = c. Then c = a ± b, accordingly as C does or does not lie between A and B. Let the point A slide on I and B on II. Let the marked point C possess the coordinates $x$ and $y$. Let the angle of the line with respect to the x-axis be $v$; thus $x$, as the projection of $a$ on I, is equal to $a\cos v$; $y$, as the projection of $b$ on II, is equal to $b\sin v$; and consequently, $x^2 = a^2\cos^2 v$, $y^2 = b^2\sin^2 v$, and

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

The locus of the marked point C is thus an ellipse with the half axes a and b. This locus property is the basis of the so-called paper strip construction of the ellipse and of the trammel.

Paper  Strip  Construction  of  the  Ellipse 


On the sharp edge of a paper strip we mark off the three points in the sequence B, A, C in such manner that BC = a and AC = b (< a) are equal to the given half axes of an ellipse. We move the strip in such manner that A always remains on the x-axis and B on the y-axis and we constantly mark the place at which C is situated. The locus described by the point C is an ellipse with the prescribed half axes a and b.
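The paper strip construction is simple to simulate. The following Python sketch is an illustration, with a hypothetical function name and an assumed sign convention for the inclination $v$ of the strip; it places B on the y-axis and A on the x-axis and returns the position of C.

```python
import math

def strip_point(a, b, v):
    """Position of C on a strip marked B, A, C with BC = a and AC = b < a,
    when A lies on the x-axis, B on the y-axis, and the strip is inclined
    at the angle v (radians) to the x-axis."""
    c = a - b                                   # AB, since A lies between B and C
    bx, by = 0.0, c * math.sin(v)               # B on the y-axis
    ux, uy = math.cos(v), -math.sin(v)          # unit vector from B toward A and C
    # A = B + c*(ux, uy) = (c*cos v, 0) lies on the x-axis as required.
    return bx + a * ux, by + a * uy

for v in (0.0, 0.4, 1.3, 2.9):
    x, y = strip_point(3.0, 1.0, v)
    print(round(x * x / 9.0 + y * y / 1.0, 12))
```

Each printed value should equal 1, confirming that C stays on the ellipse with half axes a and b.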


The  Trammel 

A  trammel  consists  of  a  cross  with  two  grooves  at  right  angles  to  each  other 
in  which  two  sliding  pins  A  and  B  move.  The  pins  are  fixed  to  a  beam  to  which 
at  some  point  a  movable  pencil  M  can  be  attached.  When  the  pins  slide  in  the 
grooves  the  pencil  describes  an  ellipse  with  the  half  axes  AM  and  BM. 

Now  for  the  general  van  Schooten  problem! 

Let S be the apex of the fixed angle $\sigma$ along the arms of which the vertexes A and B of the rigid triangle ABC slide. We draw the circle $\mathfrak{K}$ with AB as chord and $\sigma$ as peripheral angle, join its center M with C and determine the points of intersection P and Q of this connecting line with $\mathfrak{K}$. Let us consider this circle along with points P and Q as being firmly connected to the rigid triangle, so that it also participates in the motion of the triangle. Consequently, since $\sigma$ is the peripheral angle opposite AB, it passes continuously through S. The arcs AP and AQ continuously change their position but not their magnitude! This entails the invariance of the peripheral angles ASP and ASQ, which implies the invariance of the directions I and II that are determined by SP and SQ. Since PQ is a diameter of $\mathfrak{K}$, I and II are perpendicular to each other. We can therefore consider the motion of the vertex C as the motion of the marked point C of a rigid line PQC the other marked points of which P and Q slide along the arms I and II of a right angle. According to the above special case, C describes an ellipse.


Result: van Schooten’s theorem: The locus of one corner of a three-cornered plate the other two corners of which slide along the arms of a fixed angle is an ellipse.

The above derivation also gives the magnitudes and position of the ellipse. The axes of the ellipse have the positions I and II and the magnitudes 2CP and 2CQ.
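Van Schooten's theorem can also be confirmed numerically. The sketch below is an illustration under assumptions that are not in the text: S is placed at the origin of the complex plane, one arm lies along the real axis, the position of the triangle is parametrized by the angle $\theta$ at B in triangle SAB (law of sines), and the rigid shape is encoded by a fixed complex number z with C = A + z(B - A). If C describes an ellipse centered at S, it must have the form $u\cos\theta + w\sin\theta$ for fixed complex u and w.

```python
import cmath
import math

def third_vertex(sigma, c, z, theta):
    """Return (C, A, B) for the rigid triangle with AB = c whose vertex A lies
    on the arm of direction 0 and B on the arm of direction sigma through S = 0.
    theta is the angle SBA; z fixes the shape via C = A + z * (B - A)."""
    D = c / math.sin(sigma)                     # diameter of the circle through S, A, B
    A = D * math.sin(theta)                     # SA, by the law of sines
    B = D * math.sin(sigma + theta) * cmath.exp(1j * sigma)   # SB along the second arm
    return A + z * (B - A), A, B

sigma, c, z = 1.1, 2.0, 0.3 + 0.8j
u = third_vertex(sigma, c, z, 0.0)[0]
w = third_vertex(sigma, c, z, math.pi / 2)[0]
for theta in (0.2, 0.9, 1.7, 2.5):
    C, A, B = third_vertex(sigma, c, z, theta)
    # Deviation from the affine image of a circle, and from the rigid length AB = c.
    print(abs(C - (u * math.cos(theta) + w * math.sin(theta))), abs(abs(B - A) - c))
```

Both printed deviations should vanish up to rounding, which shows that the path of C is an affine image of a circle, i.e., an ellipse centered at S.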
Cardan’s Spur Wheel Problem


What is the locus described by a marked point on a circular disc that rolls along the inner edge of a disc of double its radius?

Jerome  Cardan,  an  Italian  mathematician  (1501-1576),  is  known  for  the 
Cardan  formula  for  solution  of  cubic  equations. 

Solution. Let the boundary of the large disc be $\mathfrak{K}$ and that of the smaller disc $\mathfrak{k}$, and let their radii be equal to $R = 2r$ and $r$, respectively. First we will observe the motion of the disc diameter AB that carries the marked point M. At the beginning of the motion let A lie at the center O of $\mathfrak{K}$ and B at the boundary point H on $\mathfrak{K}$. When the circle $\mathfrak{k}$ is rolled forward within $\mathfrak{K}$ by the arc HT, let it cut the radius OH at X, and let Y be the point at which it cuts the radius OK of $\mathfrak{K}$, which is perpendicular to OH. Since the angle XOY is 90°, XY is a diameter of $\mathfrak{k}$, and the intersection S of XY with OT is the center of $\mathfrak{k}$. If $w$ is the peripheral angle XOT of $\mathfrak{k}$ in radian measure, then the corresponding central angle XST is $2w$ and the arc XT is $2rw$. However, since $w$ also represents the central angle HOT of $\mathfrak{K}$, the arc HT $= Rw = 2rw$. The arc XT of the smaller circle is exactly as long as the arc HT of the larger circle upon which the small circle is rolled forward. X must therefore be the end B of the marked diameter AB; consequently Y is the other end A of this diameter. The rolling of a disc along the inner margin of a disc of double its width consequently means that the end points of a marked diameter of the smaller circle slide along two fixed orthogonal diameters of the larger circle. The locus of our marked point M is therefore also the locus of the mark M of the diameter AB whose end points A and B slide along the arms OK and OH of the right angle HOK. In view of the paper strip construction of the ellipse (No. 47), the locus we are seeking is thus an ellipse.

The  half  axes  of  this  ellipse  are  MA  and  MB. 


FIG.  45. 


NOTE.  Since  a  marked  point  on  the  boundary  of  the  smaller  disc  describes  a 
diameter  of  the  larger  disc,  a  gear  consisting  of  two  spur  wheels  the  ratio  of 
whose  diameters  is  as  2:1  effects  the  conversion  of  a  circular  motion  into  a 
reciprocal  rectilinear  motion. 
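The result can be checked against the standard parametrization of a point carried by a circle rolling inside another. The sketch below is illustrative only; the function name, the distance d of the marked point from the center of the small disc, and the choice of starting position are assumptions.

```python
import math

def rolling_point(r, d, w):
    """Marked point at distance d from the center of a disc of radius r that
    rolls inside a circle of radius R = 2r; w is the polar angle of the
    center of the rolling disc."""
    R = 2 * r
    k = (R - r) / r                    # equals 1 for Cardan's ratio 2 : 1
    x = (R - r) * math.cos(w) + d * math.cos(k * w)
    y = (R - r) * math.sin(w) - d * math.sin(k * w)
    return x, y

r, d = 1.5, 0.6
for w in (0.0, 0.7, 2.0, 4.4):
    x, y = rolling_point(r, d, w)
    on_ellipse = (x / (r + d)) ** 2 + (y / (r - d)) ** 2
    boundary_y = rolling_point(r, r, w)[1]     # marked point on the rim
    print(round(on_ellipse, 12), boundary_y)
```

With d < r the point stays on an ellipse with half axes r + d and r - d (the lengths MA and MB of the text); with d = r the ordinate vanishes, so the point runs back and forth along a diameter.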
Newton’s Ellipse Problem


To determine the locus of the centers of all ellipses that can be inscribed in a given (convex) quadrilateral.

Newton’s  very  elegant  solution  to  this  problem  is  based  upon  the  theorem, 
also  stemming  from  Newton: 

The line connecting the midpoints of the diagonals of a quadrilateral circumscribed about a circle passes through the center of the circle.

The  proof  of  this  property  of  a  tangent  quadrilateral  is  based  upon  the 
following  auxiliary  theorem:  The  locus  of  the  common  vertex  of  two  triangles 
with  prescribed  base  lines  and  a  prescribed  area  sum  is  a  straight  line. 

[Proof: Let $f$ and $g$ be the two prescribed base lines, $x$ and $y$ the distances of the common vertex S of the two triangles from the prescribed base lines and, at the same time, the “coordinates” of the point S. The prescribed sum of the areas of the two triangles we will call $K$. Since the triangles have the areas $\tfrac{1}{2}fx$ and $\tfrac{1}{2}gy$, we obtain the equation $fx + gy = 2K$, and this is the equation of a straight line.]

Let there be circumscribed about a circle of center O and radius $r$ the tangent quadrilateral ABCD with the sides AB = a, BC = b, CD = c, DA = d, so that $a + c = b + d$. Let M be the midpoint of the diagonal AC and N the midpoint of BD, $2J$ the area of the quadrilateral. Since $\triangle MAB$ and $\triangle MCD$ have areas equal to one half of $\triangle CAB$ and $\triangle ACD$, respectively, the sum of the areas of the two triangles MAB and MCD is equal to $J$, or half the area of the quadrilateral. The same holds for the triangles NAB and NCD. Consequently, the line MN is the locus of the common vertex S of all the pairs of triangles (SAB, SCD) having the area sum $J$. However, since the two triangles OAB and OCD also have the area sum $J$ (specifically,

$$\mathrm{I} = OAB + OCD = r\,\frac{a + c}{2} \quad\text{and}\quad \mathrm{II} = OBC + ODA = r\,\frac{b + d}{2}$$

and I = II. From I + II = 2J it then follows that I = II = J), thus O belongs to the locus. Q.E.D.

FIG. 46.
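Newton's theorem on the tangent quadrilateral lends itself to a quick numerical test. The sketch below is an illustration with hypothetical names; it assumes the circle is centered at the origin, builds the quadrilateral from four tangents given by the polar angles of their points of contact, and measures how far O, M, N are from being collinear.

```python
import math

def tangential_quadrilateral(angles, r=1.0):
    """Vertices of the quadrilateral formed by the tangents to the circle of
    radius r about the origin at four polar angles given in increasing order
    (consecutive points of contact less than a half turn apart)."""
    vertices = []
    for i in range(4):
        t1, t2 = angles[i], angles[(i + 1) % 4]
        m, h = (t1 + t2) / 2, (t2 - t1) / 2
        # Intersection of the tangents x cos t + y sin t = r at t1 and t2.
        vertices.append((r * math.cos(m) / math.cos(h),
                         r * math.sin(m) / math.cos(h)))
    return vertices

def collinearity_defect(vertices):
    """Cross product of OM and ON; it is zero exactly when O, M, N are collinear."""
    A, B, C, D = vertices
    M = ((A[0] + C[0]) / 2, (A[1] + C[1]) / 2)   # midpoint of diagonal AC
    N = ((B[0] + D[0]) / 2, (B[1] + D[1]) / 2)   # midpoint of diagonal BD
    return M[0] * N[1] - M[1] * N[0]

quad = tangential_quadrilateral([0.3, 1.9, 3.4, 5.0])
print(collinearity_defect(quad))
```

The printed value should be zero up to rounding for any admissible choice of the four angles.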

Now  for  the  solution  to  Newton’s  problem! 

Let us consider any ellipse inscribed in the given quadrilateral as the normal projection of a circle. In this projection the quadrilateral appears as the image (the normal projection) of an object quadrilateral circumscribed about the circle. Now, since: 1. in the object the center of the circle lies upon the line connecting the midpoints of the diagonals; 2. midpoints are preserved in the normal projection; 3. the center of the ellipse is the image of the center of the circle, then in the image also the ellipse center lies on the line joining the midpoints of the diagonals of the prescribed quadrilateral.

Conclusion: The locus of the centers of all the ellipses that can be inscribed in a given quadrilateral is a straight line, specifically, the line connecting the midpoints of the diagonals of the quadrilateral.
The Poncelet-Brianchon Hyperbola Problem


To  determine  the  locus  of  the  intersection  of  the  altitudes  of  all  the  triangles 
that  can  be  inscribed  in  a  right-angle  (equilateral)  hyperbola. 


Brianchon (1785-1864) and Poncelet (1788-1867) were French mathematicians. The solution is in vol. XI of the Annales de Gergonne (1820-1821).

We relate the hyperbola to its asymptotes, which will serve as coordinate axes (the x-axis and $\xi$-axis), and take the abscissa (ordinate) of the vertex of the hyperbola as the unit length. The equation for the hyperbola then reads

$$x\xi = 1.$$

Let PQR be an arbitrary triangle inscribed in the hyperbola, i.e., a triangle whose vertexes P, Q, R lie on the hyperbola. Let the abscissas of the points P, Q, R be $a$, $b$, $c$, the ordinates thus being $\alpha = 1/a$, $\beta = 1/b$, $\gamma = 1/c$.

The slope of the side QR is $(\beta - \gamma)/(b - c)$ or, if we substitute $1/b$ and $1/c$ for $\beta$ and $\gamma$, $-1/bc$. The slope of the altitude to QR is thus $bc$.

The equation of this altitude is thus $\xi - \alpha = bc(x - a)$ or

(1)  $\xi + abc = bc(x + \alpha\beta\gamma)$.

For the altitude passing through Q we obtain similarly

(2)  $\xi + abc = ca(x + \alpha\beta\gamma)$.

Now, if the coordinates of the altitude intersection are understood to be $x \mid \xi$, (1) and (2) both apply, and by equating the right sides we find the abscissa $x$ of the point of intersection of the altitudes:

(I)  $x = -\alpha\beta\gamma$.

If we introduce this value into (1) or (2), we obtain as the ordinate of the altitude intersection

(II)  $\xi = -abc$.

Multiplying (I) and (II) finally gives us

$$x\xi = 1.$$

The altitude intersection thus lies on the hyperbola. Consequently:

The locus of the point of intersection of the altitudes of all the triangles that can be inscribed in an equilateral hyperbola is the hyperbola itself.
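The two formulas for the altitude intersection can be verified with exact rational arithmetic. The following sketch is illustrative; the function name is hypothetical, and the altitude intersection is computed directly from two perpendicularity conditions rather than from equations (1) and (2).

```python
from fractions import Fraction

def orthocenter(P, Q, R):
    """Intersection of the altitudes through P and Q of the triangle PQR."""
    # Altitude through P is perpendicular to QR: a1*x + b1*y = c1.
    a1, b1 = R[0] - Q[0], R[1] - Q[1]
    c1 = a1 * P[0] + b1 * P[1]
    # Altitude through Q is perpendicular to PR: a2*x + b2*y = c2.
    a2, b2 = R[0] - P[0], R[1] - P[1]
    c2 = a2 * Q[0] + b2 * Q[1]
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

a, b, c = Fraction(2), Fraction(-3), Fraction(1, 2)
P, Q, R = (a, 1 / a), (b, 1 / b), (c, 1 / c)      # three points of x * xi = 1
H = orthocenter(P, Q, R)
print(H, H == (-1 / (a * b * c), -a * b * c), H[0] * H[1] == 1)
```

The comparison shows that the computed point agrees with (I) and (II) and that the product of its coordinates is 1, so it lies on the hyperbola.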
A Parabola as Envelope


On one arm of an angle the arbitrary segment e and, on the other, the segment f are marked off n times in succession from the vertex of the angle, and the segment end points are numbered, beginning from the vertex, 0, 1, 2, ..., n and n, n - 1, ..., 2, 1, 0, respectively.

Prove  that  the  lines  joining  the  points  with  the  same  number  envelop  a 
parabola. 

The  proof  is  based  upon  the 

Theorem of Apollonius: Two tangents to a parabola are divided into segments of like proportion by a third and this third is divided in the same proportion by its point of tangency.

More precisely: If the two parabola tangents SA and SB, with the points of tangency A and B, are intersected by a third parabola tangent at P and Q, and if O is the point of tangency of this third tangent (Figure 40), we obtain the equation

$$\frac{SP}{PA} = \frac{OQ}{OP} = \frac{BQ}{SQ}.$$

The proof of the Apollonian theorem is based upon the known parabola property: The point of intersection of two parabola tangents lies on a parallel to the parabola axis, passing through the midpoint of the chord connecting the points of tangency. (It follows directly from the fact that the three perpendicular bisectors of the sides of the triangle FA'B' whose vertexes are the focus F and the projections A' and B' of the points of tangency A and B on the directrix pass through a single point. Two of the perpendicular bisectors are the tangents and the third is the parallel to the axis.)

Because of this property

(1)  $p' = a'$,  (2)  $q' = b'$,  (3)  $b' + \beta' = a' + \alpha'$,

if we call the projections of the segments AP = $a$, PS = $\alpha$, BQ = $b$, QS = $\beta$, OP = $p$, OQ = $q$ on the directrix $a'$, $\alpha'$, $b'$, .... Moreover, as a result of the equality of the projections of the segment PQ and the traverse PSQ,

(4)  $\alpha' + \beta' = q' + p'$.

If, in accordance with (1) and (2), we substitute $a'$ and $b'$ for $p'$ and $q'$ in (4), we obtain

$$\alpha' + \beta' = a' + b',$$

and this equation combined with (3) shows that

$$\alpha' = b' \quad\text{and}\quad \beta' = a'.$$

FIG. 47.

This now gives us

$$\frac{\alpha}{a} = \frac{\alpha'}{a'} = \frac{b'}{a'}, \qquad \frac{q}{p} = \frac{q'}{p'} = \frac{b'}{a'}, \qquad \frac{b}{\beta} = \frac{b'}{\beta'} = \frac{b'}{a'},$$

which proves the theorem of Apollonius.

The execution of the envelope construction described above is now very simple. Let us call the vertex of the angle S; we then select on the arms of the angle the points A and B in such manner that SA = ne and SB = nf (A and B are the same points that received the numbers n and 0 in the numbering process previously described), and consider the parabola that is tangent to the arms of the angle at A and B. According to Apollonius’ theorem, the line connecting the point P on SA to which the number $\nu$ has been assigned with the point Q on SB bearing the same number is tangent to the parabola. [The ratios PS : PA and QB : QS are both equal to $\nu : (n - \nu)$.] Consequently, the parabola is enveloped by the lines joining the points with the same numbers.

At  the  same  time,  Apollonius’  theorem  makes  it  possible  to  draw  the 
tangency  point  for  each  connecting  line. 
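The envelope construction can be checked exactly by writing the parabola that touches SA at A and SB at B as a quadratic Bezier curve with control points A, S, B. That representation is a standard fact brought in here as an assumption; it does not appear in the text, and the function names are hypothetical. The sketch tests that each joining line touches this curve at the point dividing PQ in the ratio given by Apollonius' theorem.

```python
from fractions import Fraction

def lerp(P, Q, t):
    """Point at the fraction t of the way from P to Q."""
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))

def joining_line(S, A, B, n, v):
    """End points P on SA and Q on SB of the line joining the points numbered v."""
    P = lerp(S, A, Fraction(v, n))          # SP = v * e
    Q = lerp(S, B, Fraction(n - v, n))      # SQ = (n - v) * f
    return P, Q

def parabola_point_and_tangent(S, A, B, t):
    """Point and tangent vector at parameter t of the quadratic Bezier curve
    with control points A, S, B (a parabola touching SA at A and SB at B)."""
    point = tuple((1 - t) ** 2 * A[i] + 2 * t * (1 - t) * S[i] + t * t * B[i]
                  for i in range(2))
    tangent = tuple(2 * ((1 - t) * (S[i] - A[i]) + t * (B[i] - S[i]))
                    for i in range(2))
    return point, tangent

S, A, B, n = (0, 0), (6, 0), (2, 8), 8
for v in range(n + 1):
    P, Q = joining_line(S, A, B, n, v)
    t = Fraction(n - v, n)
    O, tangent = parabola_point_and_tangent(S, A, B, t)
    dx, dy = Q[0] - P[0], Q[1] - P[1]
    # O lies on PQ with PO : OQ = (n - v) : v, and PQ is parallel to the tangent.
    print(v, O == lerp(P, Q, t), dx * tangent[1] - dy * tangent[0] == 0)
```

Every line of output should report True twice, once for the point of contact and once for the direction.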
The Astroid


To  find  the  envelope  of  a  straight  line,  two  marked  points  on  which  slide 
along  two  fixed,  mutually  perpendicular  axes. 


Gottfried  Wilhelm  Leibniz  (1646-1716),  the  inventor  of  infinitesimal 
calculus,  founded  the  theory  of  envelopes  in  1692  in  his  paper  De  linea  ex  lineis 
numero  infinitis  ordinatim  ductis  inter  se  concurrentibus  easque  omnes  tangente. 

Solution. We seek the equation of the envelope in the coordinate system in which the two given axes are the x-axis and y-axis and their intersection O is the origin.

Let the constant distance between the designated points be represented by $l$. Let AB and A'B' represent two positions of the marked off distance $l$, M and N the midpoints of AA' and BB', $OM = a$, $ON = b$, $AA' = 2\alpha$, $BB' = 2\beta$, thus $OA = a + \alpha$, $OA' = a - \alpha$, $OB = b - \beta$, $OB' = b + \beta$. The conditions $AB = l$ and $A'B' = l$ can then be written

(1)  $(a + \alpha)^2 + (b - \beta)^2 = l^2$ and $(a - \alpha)^2 + (b + \beta)^2 = l^2$,

from which we obtain by subtraction

(2)  $a\alpha = b\beta$.


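The point of intersection S(x, y) of the two straight lines AB and A'B' is expressed by the two equations

$$\frac{x}{a + \alpha} + \frac{y}{b - \beta} = 1 \quad\text{and}\quad \frac{x}{a - \alpha} + \frac{y}{b + \beta} = 1,$$

and the following two equations:

(3)  $\dfrac{ax}{a^2 - \alpha^2} + \dfrac{by}{b^2 - \beta^2} = 1$,  (4)  $\dfrac{\alpha x}{a^2 - \alpha^2} = \dfrac{\beta y}{b^2 - \beta^2}$,

which are obtained from the first two by addition and subtraction. If we then divide (4) by (2), we obtain

$$\frac{x}{a(a^2 - \alpha^2)} = \frac{y}{b(b^2 - \beta^2)}$$

and with the use of (3),

(5)  $x = a\,\dfrac{a^2 - \alpha^2}{a^2 + b^2}, \qquad y = b\,\dfrac{b^2 - \beta^2}{a^2 + b^2}$.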

If we then allow A and A' and B and B' to approach each other (naturally maintaining the conditions $AB = l$ and $A'B' = l$), then $\alpha$ and $\beta$ become continuously smaller and the point of intersection S of the lines AB and A'B' comes closer and closer to the envelope, finally reaching it when $\alpha$ and $\beta$ are equal to zero. The point $x \mid y$ at which the envelope is reached is then represented, according to (5), by the equations

(5')  $x = \dfrac{a^3}{a^2 + b^2}, \qquad y = \dfrac{b^3}{a^2 + b^2}$,

in which, in view of (1),

(1')  $a^2 + b^2 = l^2$

is true.

From (5') it then follows that

$$a^3 = l^2 x, \quad b^3 = l^2 y \quad\text{or}\quad a^2 = l^{4/3}x^{2/3}, \quad b^2 = l^{4/3}y^{2/3},$$

from which

$$l^2 = l^{4/3}x^{2/3} + l^{4/3}y^{2/3}$$

is obtained by addition.

The equation of the envelope thus reads

$$x^{2/3} + y^{2/3} = l^{2/3}$$

or, in rational form,

$$(l^2 - x^2 - y^2)^3 = 27\,l^2x^2y^2.$$

(The second form is obtained from the first by cubing twice. The first cubing results in

$$x^2 + y^2 + 3x^{2/3}y^{2/3}\left(x^{2/3} + y^{2/3}\right) = l^2$$

or

$$3x^{2/3}y^{2/3}l^{2/3} = l^2 - x^2 - y^2,$$

and on the second cubing we obtain the indicated form.)
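The limiting process can be imitated numerically. In the sketch below, which is an illustration with hypothetical names, the position of the segment is described by an angle $\theta$ with $a = l\cos\theta$ and $b = l\sin\theta$ (an assumed parametrization satisfying $a^2 + b^2 = l^2$); the intersection of two neighboring positions is compared with the envelope point $x = a^3/l^2$, $y = b^3/l^2$.

```python
import math

def envelope_point(l, theta):
    """Envelope point x = a^3 / l^2, y = b^3 / l^2 for a = l cos(theta),
    b = l sin(theta)."""
    a, b = l * math.cos(theta), l * math.sin(theta)
    return a ** 3 / l ** 2, b ** 3 / l ** 2

def neighbor_intersection(l, theta, eps):
    """Intersection of the two positions of the segment belonging to
    theta - eps and theta + eps (lines x/a + y/b = 1)."""
    a1, b1 = l * math.cos(theta - eps), l * math.sin(theta - eps)
    a2, b2 = l * math.cos(theta + eps), l * math.sin(theta + eps)
    det = 1 / (a1 * b2) - 1 / (a2 * b1)
    x = (1 / b2 - 1 / b1) / det
    y = (1 / a1 - 1 / a2) / det
    return x, y

l = 2.0
for theta in (0.3, 0.8, 1.2):
    x, y = envelope_point(l, theta)
    xi, yi = neighbor_intersection(l, theta, 1e-4)
    print(x ** (2 / 3) + y ** (2 / 3) - l ** (2 / 3), abs(xi - x), abs(yi - y))
```

All three printed quantities should be close to zero: the first confirms the astroid equation, the other two show the intersection point approaching the envelope.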

Because of its shape the curve $x^{2/3} + y^{2/3} = l^{2/3}$ is called an astrois or astroid in accordance with a proposal made by J. J. Littrow in 1838, or a star line after M. Simon’s proposal.

The  astroid  is  a  hypocycloid*  in  which  the  radius  of  the  fixed  circle  is  four 
times  that  of  the  rolling  circle. 

Proof. In Figure 49, let C be the center, $l$ the radius, the arc JT a section of the fixed circle $\mathfrak{K}$, $\mathfrak{k}$ the rolling circle at the moment in which it touches $\mathfrak{K}$ at the point T, so that the center Z of the rolling circle cuts the radius CT into the two segments $ZT = r = \tfrac{1}{4}l$ and $CZ = 3r$. Also, let M be the point on the circumference of $\mathfrak{k}$ whose path we are to follow, $x$ its abscissa and $y$ its ordinate. We then select C as the origin of the coordinates and draw the (horizontal) x-axis through point J, at which the marked point was at the beginning of its motion. The arcs JT of $\mathfrak{K}$ and TM of $\mathfrak{k}$ are then of equal length; the sector angle $W = \angle TZM$ is therefore four times the sector angle $w = \angle JCT$. The slope of the radius ZM from the horizontal is $4w - w = 3w$, and the horizontal and vertical projections of ZM are $r\cos 3w$ and $r\sin 3w$, respectively. The corresponding projections of CZ are $3r\cos w$ and $3r\sin w$. Thus we obtain the equations (which can be read off the figure)


FIG. 49.

$$x = 3r\cos w + r\cos 3w, \qquad y = 3r\sin w - r\sin 3w,$$

which, as a result of the relationships

$$\cos 3w = 4\cos^3 w - 3\cos w, \qquad \sin 3w = 3\sin w - 4\sin^3 w,$$

can be transformed into

$$x = l\cos^3 w, \qquad y = l\sin^3 w.$$

In the pair of equations obtained the coordinates of the hypocycloid point $x \mid y$ are represented as functions of the so-called rolling angle $w$.

To obtain the curve equation in Cartesian coordinates, we solve for $\cos w$ and $\sin w$, square, and add. Thus, we obtain

$$x^{2/3} + y^{2/3} = l^{2/3},$$

i.e., the equation of an astroid, which was to be demonstrated.
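The transformation of the rolling equations can be confirmed numerically. The short sketch below is illustrative, with a hypothetical function name; it evaluates the hypocycloid point from the two projections and compares it with $l\cos^3 w$ and $l\sin^3 w$.

```python
import math

def hypocycloid_point(l, w):
    """Point of the hypocycloid for a fixed circle of radius l and a rolling
    circle of radius r = l / 4, at the rolling angle w."""
    r = l / 4
    x = 3 * r * math.cos(w) + r * math.cos(3 * w)
    y = 3 * r * math.sin(w) - r * math.sin(3 * w)
    return x, y

l = 4.0
for w in (0.2, 1.0, 2.5, 4.0):
    x, y = hypocycloid_point(l, w)
    print(x - l * math.cos(w) ** 3, y - l * math.sin(w) ** 3,
          abs(x) ** (2 / 3) + abs(y) ** (2 / 3) - l ** (2 / 3))
```

Absolute values are used in the last expression because fractional powers of negative numbers are not real in floating-point arithmetic; the curve itself is symmetric in both axes.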
Steiner’s Three-pointed Hypocycloid


To  determine  the  envelope  of  the  Wallace  line  of  a  triangle. 


Solution. Let ABC be the given triangle, M the center, and r the radius of the circle U circumscribed about it.

A Wallace line of a triangle is the line connecting the three base points of the perpendiculars dropped from any point P on the circumscribed circle to the sides of the triangle.

We will make M the origin of an X-Y coordinate system and preliminarily select the X-axis arbitrarily. If we designate the angles formed by the radii MA, MB, MC, MP with the positive side of the X-axis as $2\alpha$, $2\beta$, $2\gamma$, $2\varphi$, the coordinates of the three corners A, B, C are

$$(r\cos 2\alpha \mid r\sin 2\alpha), \quad (r\cos 2\beta \mid r\sin 2\beta), \quad (r\cos 2\gamma \mid r\sin 2\gamma),$$

and the coordinates of the point P are $(r\cos 2\varphi \mid r\sin 2\varphi)$.

In order to find the coordinates X₁ | Y₁ of the base point F₁ of the perpendicular dropped from P to BC, we form the equations of the line BC (in the two-point form) and the line PF₁ (in the slope form) and find from these equations that

$$X_1 = f\,[\cos 2\beta + \cos 2\gamma + \cos 2\varphi - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$Y_1 = f\,[\sin 2\beta + \sin 2\gamma + \sin 2\varphi - \sin(2\beta + 2\gamma - 2\varphi)],$$

where f represents half of r.

Accordingly, the coordinates X₂ | Y₂ of the base point F₂ of the perpendicular dropped from P to CA will naturally be

$$X_2 = f\,[\cos 2\gamma + \cos 2\alpha + \cos 2\varphi - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$Y_2 = f\,[\sin 2\gamma + \sin 2\alpha + \sin 2\varphi - \sin(2\gamma + 2\alpha - 2\varphi)].$$

An  appropriate  parallel  displacement  of  the  coordinate  system  allows  us  to 
put  the  coordinates  into  a  simpler  form.  This  displacement  of  the  coordinate 
system is based upon Sylvester’s theorem (No. 27).

In accordance with this, the altitude intersection H of the triangle ABC has the coordinates

$$r(\cos 2\alpha + \cos 2\beta + \cos 2\gamma) \quad\text{and}\quad r(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$

Since the center F of the Feuerbach circle lies halfway between M and H (No. 28), the coordinates of F are

$$X_0 = f(\cos 2\alpha + \cos 2\beta + \cos 2\gamma),$$
$$Y_0 = f(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$

It is therefore convenient to select the center of the Feuerbach circle as the origin of the new coordinate system x, y. Between the coordinates X | Y of a point in the old system and x | y in the new system there exist the relations

$$X = X_0 + x, \qquad Y = Y_0 + y.$$

From these relations we obtain for the coordinates (x₁ | y₁) and (x₂ | y₂) of the points F₁ and F₂ in the new system the simpler values

$$x_1 = f\,[\cos 2\varphi - \cos 2\alpha - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$y_1 = f\,[\sin 2\varphi - \sin 2\alpha - \sin(2\beta + 2\gamma - 2\varphi)]$$

and

$$x_2 = f\,[\cos 2\varphi - \cos 2\beta - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$y_2 = f\,[\sin 2\varphi - \sin 2\beta - \sin(2\gamma + 2\alpha - 2\varphi)].$$

Now the equation for the Wallace line F₁F₂ reads

$$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}.$$

For the differences x₂ − x₁ and y₂ − y₁ appearing here, we obtain, in accordance with the coordinate values just given, the expressions

$$\begin{aligned}
x_2 - x_1 &= f(\cos 2\alpha - \cos 2\beta) + f\,[\cos(2\beta + 2\gamma - 2\varphi) - \cos(2\gamma + 2\alpha - 2\varphi)] \\
&= -2f \sin(\alpha + \beta)\sin(\alpha - \beta) + 2f \sin(\alpha + \beta + 2\gamma - 2\varphi)\sin(\alpha - \beta) \\
&= 4f \sin(\alpha - \beta)\sin(\gamma - \varphi)\cos(\alpha + \beta + \gamma - \varphi)
\end{aligned}$$

and similarly

$$y_2 - y_1 = 4f \sin(\alpha - \beta)\sin(\gamma - \varphi)\sin(\alpha + \beta + \gamma - \varphi).$$

The quotient (y₂ − y₁)/(x₂ − x₁) thus has the value sin Φ/cos Φ with Φ = α + β + γ − φ, and the equation of the Wallace line assumes the form

$$x \sin\Phi - y \cos\Phi = x_1 \sin\Phi - y_1 \cos\Phi.$$

Using the above values for the coordinates x₁ and y₁ we are able to write the right side of this equation as

$$f(\sin\Phi\cos 2\varphi - \cos\Phi\sin 2\varphi) - f(\sin\Phi\cos 2\alpha - \cos\Phi\sin 2\alpha) - f\,[\sin\Phi\cos(2\beta + 2\gamma - 2\varphi) - \cos\Phi\sin(2\beta + 2\gamma - 2\varphi)],$$

which expression becomes, according to the addition theorem of circular functions,

$$f\sin(\alpha + \beta + \gamma - 3\varphi) - f\sin(\beta + \gamma - \alpha - \varphi) - f\sin(\alpha - \beta - \gamma + \varphi) = f\sin(\alpha + \beta + \gamma - 3\varphi).$$

Now the equation of the Wallace line reads

$$x \sin(\alpha + \beta + \gamma - \varphi) - y \cos(\alpha + \beta + \gamma - \varphi) = f \sin(\alpha + \beta + \gamma - 3\varphi).$$
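As a plausibility check on this equation, the following Python sketch (an illustration added here, not part of the original text) places a triangle and a point P on a circle of radius r, computes the three base points directly by orthogonal projection, shifts the origin to the center of the Feuerbach circle, and evaluates the difference between the two sides of the Wallace-line equation. The function names `foot` and `wallace_residuals` and the sample angles are assumptions chosen for the example.

```python
import math

def foot(p, b, c):
    """Foot of the perpendicular dropped from point p to the line through b and c."""
    (px, py), (bx, by), (cx, cy) = p, b, c
    dx, dy = cx - bx, cy - by
    t = ((px - bx) * dx + (py - by) * dy) / (dx * dx + dy * dy)
    return bx + t * dx, by + t * dy

def wallace_residuals(r, alpha, beta, gamma, phi):
    """Left side minus right side of the Wallace-line equation at the three feet."""
    f = r / 2

    def on_circle(t):
        # Point of the circumscribed circle whose radius makes the angle 2t with the X-axis.
        return r * math.cos(2 * t), r * math.sin(2 * t)

    A, B, C, P = (on_circle(t) for t in (alpha, beta, gamma, phi))
    # Center F of the Feuerbach circle: X0 = f * (sum of the cosines), Y0 likewise.
    x0 = (A[0] + B[0] + C[0]) / 2
    y0 = (A[1] + B[1] + C[1]) / 2
    big_phi = alpha + beta + gamma - phi
    rhs = f * math.sin(alpha + beta + gamma - 3 * phi)
    residuals = []
    for X, Y in (foot(P, B, C), foot(P, C, A), foot(P, A, B)):
        x, y = X - x0, Y - y0  # shift the origin to F
        residuals.append(x * math.sin(big_phi) - y * math.cos(big_phi) - rhs)
    return residuals

if __name__ == "__main__":
    print(wallace_residuals(2.0, 0.3, 1.1, 2.0, 0.7))
```

All three residuals should be zero up to rounding, which also confirms that the third base point (on AB) lies on the same line.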


For the sake of a final simplification we now choose the position of the hitherto arbitrary x-axis in such manner that the sum of the three angles α, β, γ is equal to an integral multiple of 2π. It is easily seen that with F as the point of origin there are only three rays, separated from each other by angles of 2π/3, that satisfy this condition. We choose one of these three rays as the x-axis. In the coordinate system thus determined, the Wallace line has the simple equation

(1)  x sin φ + y cos φ = f sin 3φ.

To interpret this equation geometrically we draw a triangle FQR with the side FQ = f, with the angles 2φ at F and φ at R, thus with the external angle 3φ at Q, whose side FR lies on the positive x-axis. The side QR of this triangle is then the Wallace line s represented by (1). In fact: If x = FU is the abscissa, y = UV the ordinate of any point V of the line s, then the perpendicular FW dropped from F to s is f sin 3φ as the projection of FQ; on the other hand, as the projection of the traverse FU + UV, it is x sin φ + y cos φ, so that equation (1) applies to the coordinates of V.

In particular, if V is the base point of the perpendicular TV dropped to s from the end point T of the extension QT = 2f of FQ, V lies on the circle k whose center Z is the midpoint of the hypotenuse QT of the right triangle QTV, which has the radius f and which is tangent to the Feuerbach circle at Q and to the circle K of center F and radius 3f at T. Since ∠VZT, as an external angle of the isosceles triangle VZQ, is equal to 6φ, the arc VT of the circle k is equal to f · 6φ. And since the arc JT stretching from the point of intersection J of circle K with the x-axis to T is equal to 3f · 2φ, and is therefore also equal to 6fφ, it follows that


FIG. 50.


arc VT of k = arc JT of K.

If we then think of circle k as rolling along circle K (along the inside) so that a point m marked off on k initially lies at J, the marked point arrives precisely at point V at the moment when the rolling circle k assumes the drawn position.

The locus of point V is consequently, as the path of the marked point m, a hypocycloid (cf. No. 52), in which the radius of the fixed circle is three times as large as the radius of the rolling circle. And since at the moment depicted in the drawing the rolling circle is rotating precisely about the instantaneous point of rotation T, at this moment the marked point m at V is moving in a direction QV that is precisely perpendicular to TV, i.e., the Wallace line s is the tangent drawn to the hypocycloid at V. Thus the totality of Wallace lines represents the totality of all the hypocycloid tangents.


FIG.  51. 

Conclusion:  Steiner’s  theorem:  The  envelope  of  the  Wallace  lines  of  a 
triangle  is  a  hypocycloid  whose  fixed  circle  possesses  a  radius  that  is  three  times 
as  great  as  the  radius  of  the  rolling  circle.  The  center  of  the  fixed  circle  is  the 
center  of  the  Feuerbach  circle  of  the  triangle,  and  the  radius  of  the  rolling  circle 
is  equal  to  the  radius  of  the  Feuerbach  circle. 
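The tangency can also be checked numerically. The sketch below is an added illustration, not the author's argument: it obtains the point of contact by the standard envelope computation (solving equation (1) together with its derivative with respect to φ), which gives x = f(2 cos 2φ + cos 4φ), y = f(2 sin 2φ − sin 4φ). The function names are hypothetical. It then verifies that this point lies on the line (1), that the curve's direction of motion there has no component along the normal of the line, and that the curve reaches the distance 3f from F.

```python
import math

def contact_point(f, phi):
    """Point at which the line x sin(phi) + y cos(phi) = f sin(3 phi) touches its envelope."""
    x = f * (2 * math.cos(2 * phi) + math.cos(4 * phi))
    y = f * (2 * math.sin(2 * phi) - math.sin(4 * phi))
    return x, y

def line_residual(f, phi):
    """Left side minus right side of equation (1) at the contact point."""
    x, y = contact_point(f, phi)
    return x * math.sin(phi) + y * math.cos(phi) - f * math.sin(3 * phi)

def tangency_residual(f, phi, h=1e-6):
    """Component of the curve's velocity along the normal (sin phi, cos phi) of the line."""
    x1, y1 = contact_point(f, phi - h)
    x2, y2 = contact_point(f, phi + h)
    vx, vy = (x2 - x1) / (2 * h), (y2 - y1) / (2 * h)  # central difference
    return vx * math.sin(phi) + vy * math.cos(phi)

if __name__ == "__main__":
    f = 1.0
    for phi in (0.2, 0.9, 1.7, 2.5):
        print(phi, line_residual(f, phi), tangency_residual(f, phi))
    print(math.hypot(*contact_point(f, math.pi / 3)))  # distance of a cusp from F
```

With f = 1 the cusps (φ = 0, π/3, 2π/3) lie at distance 3 from F, in agreement with a fixed circle of radius 3f.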

The three cusps of the hypocycloid (the three places at which the marked point on the rolling circle touches the fixed circle) are the end points of the three radii of the fixed circle, separated from each other by 120°, of which one lies on the positive x-axis.

The  three  apexes  of  the  hypocycloid — the  three  places  at  which  the  marked 
point  on  the  rolling  circle  touches  the  Feuerbach  circle — divide  the  arcs  of  the 
Feuerbach  circle  lying  outside  the  triangle,  from  the  midpoints  of  the  sides,  into 
segments  whose  ratio  to  each  other  is  as  1:2. 

[This  ratio  follows  easily  from  the  position  of  the  x-axis  and  from  the  fact 
that  the  peripheral  angle  opposite  the  arc  of  a  Feuerbach  circle  cut  off  by  a 
triangle  side  is  equal  to  the  difference  between  the  two  triangle  angles  at  the  end 
points  of  the  side.] 
The Most Nearly Circular Ellipse Circumscribing a
Quadrilateral 

Of all the ellipses circumscribing a given quadrilateral, which deviates least from a circle?

This problem, which was posed in the seventeenth volume of Gergonne’s Annales de Mathematiques, was solved by J. Steiner (Crelle’s Journal, vol. 11; also: Steiner, Gesammelte Werke, vol. I).

Solution (according to Steiner). To begin with, it is clear that the quadrilateral must be convex inasmuch as no ellipse can be circumscribed about a concave quadrilateral.


Let OPRQ be the given quadrilateral, let QR cut the extension of OP at H and PR cut the extension of OQ at K, and let OP = p, OQ = q, OH = h, OK = k. We will take OP as the x-axis, OQ as the y-axis of an oblique-angle coordinate system. The equations for the sides OP and OQ of the quadrilateral are then y = 0 and x = 0, while the equations for the sides PR and QR are

$$\frac{x}{p} + \frac{y}{k} = 1 \quad\text{and}\quad \frac{x}{h} + \frac{y}{q} = 1$$

or, if we designate the expressions

$$kx + py - kp \quad\text{and}\quad qx + hy - hq$$

as u and v, u = 0 and v = 0.

The equation for every ellipse that can be circumscribed about the quadrilateral has the form

(1)  λxu + μyv = 0,

where λ and μ are two arbitrary constants or so-called parameters. [Since at O x = 0 and y = 0, at P y = 0 and u = 0, at Q x = 0 and v = 0, and, finally, at R u = 0 and v = 0, the second degree curve G represented by (1) passes through all four corners. If, conversely, E is an ellipse of circumscription, which, moreover, also passes through the fifth point x₀ | y₀, and if we choose λ and μ in such manner that

λx₀(kx₀ + py₀ − kp) + μy₀(qx₀ + hy₀ − hq) = 0,

then x₀ | y₀ also lies on G. Since, however, only one second degree curve can pass through five points, G is the ellipse E. Thus, every ellipse of circumscription can be represented by (1).]

We introduce the values of u and v into (1) and obtain the equation of an arbitrary ellipse of circumscription:

(1′)  Ax² + 2Bxy + Cy² + 2Dx + 2Ey = 0,

where

A = kλ,  2B = pλ + qμ,  C = hμ,  2D = −kpλ,  2E = −hqμ.

We begin by looking for the locus of the midpoints of all the parallel chords of the ellipse (1′)

(2)  y = Mx + n,

in which M is the common directional constant of the chords, n the segment cut off on the y-axis by one of these chords, chosen arbitrarily.

If we introduce y from (2) into (1′), we obtain the quadratic equation

$$(A + 2BM + CM^2)x^2 + 2[(Cn + E)M + Bn + D]x + Cn^2 + 2En = 0$$

for the abscissas x₁ and x₂ of the points of intersection of the chord (2) with the ellipse (1′). According to a well-known theorem from quadratic equation theory, the sum of the two roots x₁ and x₂ of this equation is

$$x_1 + x_2 = -2\,\frac{(Cn + E)M + Bn + D}{A + 2BM + CM^2},$$

i.e., the abscissa of the chord midpoint is

$$X = -\frac{(CM + B)n + EM + D}{CM^2 + 2BM + A}.$$

The chord midpoint X | Y satisfies the equation (2) of the chord, Y = MX + n, so that we can substitute Y − MX for n in the equation found for X. If we do this, we obtain for the coordinates X and Y of the chord midpoint the equation

(3)  Y = M′X + n′,

with

(3a)  $$M' = -\frac{A + BM}{B + CM}, \qquad n' = -\frac{D + EM}{B + CM}.$$

Since (3) is the equation of a straight line, the following theorem applies:

The midpoints of all the parallel chords of an ellipse possessing the directional constant M lie on a straight line (a diameter of the ellipse) with the directional constant M′. The two directional constants M and M′, as well as their corresponding directions and the diameters of the ellipse possessing these directions, are said to be conjugate to each other.

We  will  now  prove  two  auxiliary  theorems. 

Auxiliary  theorem  I:  There  is  only  one  pair  of  conjugate  directions 
(diameters)  that  belong  to  all  the  ellipses  circumscribing  a  quadrilateral. 

Proof. We replace A, B, C in (3a) with their values and obtain

$$-M' = \frac{(2k + pM)\lambda + qM\mu}{p\lambda + (2hM + q)\mu}.$$

If M′ (for a prescribed M) is to maintain the same value no matter which ellipse of circumscription we are concerned with and, consequently, no matter how great λ and μ are, then this value must be obtained when λ = 1 and μ = 0 as well as when λ = 0 and μ = 1. Consequently, it must be true that

$$\frac{2k + pM}{p} = \frac{qM}{2hM + q}.$$

And if we are able to find a suitable M for this equation, then for every λ and every μ

$$-M' = \frac{(2k + pM)\lambda + qM\mu}{p\lambda + (2hM + q)\mu} = \frac{2k + pM}{p}$$

or

(4)  $$M' = -M - \frac{2k}{p},$$

i.e., M′ is independent of λ and μ. The equation giving the condition for M is written

$$hpM^2 + 2hkM + kq = 0$$

and gives the two M-values

(5)  $$M_1 = -\frac{k}{p} + \frac{r}{hp}, \qquad M_2 = -\frac{k}{p} - \frac{r}{hp},$$

with r² = h²k² − hp · kq = hk(hk − pq).

Since, according to the drawing, hk > pq, r² is positive, r is real, and both M-values are real. Moreover,

(5a)  $$M_1 + M_2 = -\frac{2k}{p}.$$

Now, according to (4), the directional constant M₁′ that is conjugate to M₁ has the value −M₁ − 2(k/p), i.e., the value M₂. In like manner, M₂′ = M₁.

Thus, there is only one pair of specific directions, determined by the directional constants M₁ and M₂, that will form a pair of conjugate directions for each ellipse of circumscription.
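Auxiliary theorem I lends itself to a quick numerical check. The Python sketch below is an added illustration with assumed sample values h = 5, k = 4, p = 2, q = 3 (so that hk > pq) and hypothetical function names. It computes the two roots of hpM² + 2hkM + kq = 0 and confirms, for several choices of λ and μ, that formula (3a) maps M₁ to M₂ and M₂ to M₁.

```python
import math

def conjugate_direction(M, lam, mu, h, k, p, q):
    """Directional constant M' conjugate to M, from (3a), for the curve
    lam*x*u + mu*y*v = 0 belonging to the quadrilateral with data h, k, p, q."""
    A = k * lam
    B = (p * lam + q * mu) / 2
    C = h * mu
    return -(A + B * M) / (B + C * M)

def common_directions(h, k, p, q):
    """Roots M1, M2 of h*p*M**2 + 2*h*k*M + k*q = 0 (requires h*k > p*q)."""
    r = math.sqrt(h * k * (h * k - p * q))
    return -k / p + r / (h * p), -k / p - r / (h * p)

if __name__ == "__main__":
    h, k, p, q = 5.0, 4.0, 2.0, 3.0  # assumed sample quadrilateral data
    M1, M2 = common_directions(h, k, p, q)
    for lam, mu in [(1.0, 1.0), (2.0, 0.5), (1.0, 2.0)]:
        print(lam, mu, conjugate_direction(M1, lam, mu, h, k, p, q), M2)
```

The three sample pairs (λ, μ) were chosen so that AC − B² > 0, i.e., so that the curve (1) is in each case an ellipse.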

Auxiliary  theorem  II:  The  acute  angle  formed  by  two  conjugate  diameters 
of  an  ellipse  attains  a  minimum  when  the  two  conjugate  diameters  are  equal,  and 
the tangent of the half angle-minimum is equal to the ratio b:a of the two half axes.

Proof. If ψ and φ are the two acute angles that the two conjugate diameters of an ellipse with the half axes a and b form with the large axis, then obviously

(6)  $$\tan\psi \cdot \tan\varphi = \frac{b^2}{a^2}.$$

For the angle ω = ψ + φ of the two conjugate diameters we therefore obtain

$$\tan\omega = \frac{\tan\psi + \tan\varphi}{1 - \tan\psi\tan\varphi} = \frac{\tan\psi + \tan\varphi}{1 - b^2/a^2}.$$

But the left side of this equation, and therefore the angle ω, attains a minimum when the numerator of the right side assumes its smallest value. This numerator is the sum of two numbers (tan ψ and tan φ) of constant product and, according to No. 10, attains a minimum when the numbers are equal. From tan ψ = tan φ it follows that ψ = φ and from this that the two diameters are equal. At the same time from (6) we obtain the value b/a for the tangent of the half angle-minimum.
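The minimum property can be illustrated by a brute-force scan. This Python sketch is an addition under stated assumptions (sample half axes a = 3, b = 2, hypothetical function names): for each acute angle ψ it determines φ from relation (6), forms ω = ψ + φ, and compares the smallest value found with 2 arctan(b/a).

```python
import math

def conjugate_angle(a, b, psi):
    """Angle omega = psi + phi, with phi determined by tan(psi) * tan(phi) = b^2 / a^2."""
    phi = math.atan(b * b / (a * a * math.tan(psi)))
    return psi + phi

def minimum_conjugate_angle(a, b, steps=20000):
    """Smallest omega on an even grid of psi values strictly between 0 and pi/2."""
    return min(conjugate_angle(a, b, math.pi / 2 * i / steps) for i in range(1, steps))

if __name__ == "__main__":
    a, b = 3.0, 2.0  # assumed sample half axes
    print(minimum_conjugate_angle(a, b), 2 * math.atan(b / a))
```

The two printed numbers agree to within the resolution of the grid, in line with tan(ω/2) = b/a at the minimum.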

These  preliminaries  concluded,  the  solution  of  the  problem  is  simple. 

The circumscribed ellipse becomes more and more circular, the closer the ratio b:a of the small to the large half axis comes to unity. Now, according to auxiliary theorem II, this ratio has the value tan (ω/2), where ω is the smallest angle formed by conjugate diameters. The most nearly circular circumscribed ellipse is therefore the ellipse in which ω attains its maximum possible value. And this is the ellipse in which the directional constants of its equal conjugate diameters are determined by (5). Thus, if ω₀ is the angle between the equal conjugate diameters of this ellipse, then for every other ellipse of circumscription, ω₀, as the angle between two unequal conjugate diameters (with the directional constants M₁ and M₂), is greater than the angle ω of this ellipse enclosed between equal conjugate diameters, so that ω_max = ω₀.

Consequently: 

Of  all  the  ellipses  circumscribed  about  a  quadrilateral  the  ellipse  that 
deviates  least  from  a  circle  is  the  one  whose  equal  conjugate  diameters  possess 
the  conjugate  directions  common  to  all  the  ellipses  of  circumscription. 

The  directional  constants  of  these  specific  directions  are  determined  by  the 
quadratic  equation 


$$hpM^2 + 2hkM + kq = 0.$$
The Curvature of Conic Sections


To  determine  the  curvature  of  a  conic  section. 


By  the  curvature  of  a  curve  at  a  point  is  meant  the  reciprocal  value  of  the 
radius  of  the  circle  of  curvature,  i.e.,  the  radius  of  the  circle  that  fits  the  curve 
most  closely  at  the  relevant  point. 


Solution. Let 2p be the parameter of the conic section, ε its form number, k its shortest focal radius, so that p = k(1 + ε), and finally, let its apex equation be

$$qx^2 + y^2 - 2px = 0, \quad\text{with } q = 1 - \varepsilon^2.$$

It is known that the coordinates of a point Π(ξ | η) at a distance R from another point P(x | y) and lying at a direction from P that forms the angle ν with the positive x-axis are

$$\xi = x + oR, \qquad \eta = y + iR,$$

where o is the cosine and i the sine of ν.

If Π lies on the conic section, then from

$$q\xi^2 + \eta^2 - 2p\xi = 0$$

we obtain the quadratic equation for R

$$DR^2 - ER + F = 0$$

with the coefficients

$$D = i^2 + qo^2, \qquad E = 2(ou - iy), \qquad F = qx^2 + y^2 - 2px,$$

where u = p − qx.

In respect to the conic section, we will call the three expressions D, E, F the directional number for the “direction” ν, the emanant at point x | y for the direction ν, and the power at point x | y.

If PΠ is a secant, the roots R₁ and R₂ of the quadratic equation are the segments generated on the secant by the conic section. The relations between the roots and the coefficients of a quadratic equation give us the following theorems:

I. The emanant is D times the sum of the secant segments.

II. The power is D times the product of the secant segments.

We now draw through an arbitrary point P(x | y) of the conic section the tangent and the normal and designate the segment of the normal from P to the x-axis as n and the segment reaching from P to the conic section as N. If ν is the angle of the tangent with the x-axis, o the cosine, i the sine of ν, then the directional number for the tangent direction is

$$D = i^2 + qo^2 = \frac{u^2 + qy^2}{n^2} = \frac{p^2}{n^2}$$

(since u = p − qx represents the subnormal, so that i = u/n and o = y/n, and since u² + qy² = p² for every point of the conic section), while for the directional number of the inward-pointing normal we obtain the value

$$\Delta = o^2 + qi^2.$$

The emanant at P for the direction of the normal becomes

$$E = 2(oy + iu) = 2n.$$

Therefore, according to I,

(1)  2n = ΔN.

On the tangent we select a point O whose distance OP from P we set equal to t, and we draw through O, perpendicular to the tangent, the secant through the conic section. Let the two segments of the secant created by the conic section and measured from O be s and S, and let S > s. According to II, we can write for the power of the conic section at O both Dt² and ΔSs, so that

(2)  Dt² = ΔSs.


We now draw a circle to which for the time being we will attribute the arbitrary radius ρ; the center of this circle lies on the internal normal and the circle is tangent to the conic section at P. If s₀ and S₀ > s₀ are the segments measured from O that the circle creates on the secant, then, according to the tangent theorem,

(3)  t² = s₀S₀.

By division of (2) and (3) we obtain

$$D S_0 s_0 = \Delta S s$$

and, using (1), we obtain

$$D N S_0 s_0 = 2n S s.$$

Now the closer the fraction s/s₀ is to unity, the closer the approximation of the circle to the conic section in the vicinity of point P. But this fraction, according to the last equation, has the value

$$\frac{s}{s_0} = \frac{N}{S} \cdot \frac{S_0}{2\rho} \cdot \frac{D\rho}{n}.$$

In the immediate vicinity of the point P, S becomes equal to N and S₀ = 2ρ, so that both the first and second factors on the right-hand side are equal to 1. Consequently, the fraction s/s₀ comes closest to unity when the third right-hand factor Dρ/n is also equal to 1. Thus: Of all these circles the one that most closely approximates the conic section is the one possessing the radius ρ = n/D.

Since D was previously determined as equal to p²/n², we obtain the fundamental theorem:

The radius of curvature of a conic section has the value

$$\rho = \frac{n^3}{p^2}.$$
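The formula can be compared with a direct geometric estimate. The following Python sketch is an added illustration, not the author's construction: it evaluates n³/p² at a point of the curve qx² + y² − 2px = 0 and compares it with the radius of the circle through three closely neighbouring curve points. The function names, the step h, and the sample conics (an ellipse with q = 0.64, a parabola with q = 0, a hyperbola with q = −1.25) are assumptions made for the example.

```python
import math

def radius_by_formula(p, q, x):
    """Radius of curvature n**3 / p**2 at abscissa x of q*x**2 + y**2 - 2*p*x = 0."""
    y = math.sqrt(2 * p * x - q * x * x)
    u = p - q * x            # subnormal
    n = math.hypot(u, y)     # normal segment from P to the x-axis
    return n ** 3 / p ** 2

def radius_by_three_points(p, q, x, h=1e-3):
    """Radius of the circle through three neighbouring points of the upper branch."""
    pts = [(t, math.sqrt(2 * p * t - q * t * t)) for t in (x - h, x, x + h)]
    (x1, y1), (x2, y2), (x3, y3) = pts
    a = math.hypot(x2 - x1, y2 - y1)
    b = math.hypot(x3 - x2, y3 - y2)
    c = math.hypot(x3 - x1, y3 - y1)
    twice_area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    return a * b * c / (2 * twice_area)  # circumradius = abc / (4 * area)

if __name__ == "__main__":
    for p, q, x in [(1.0, 0.64, 0.5), (1.0, 0.0, 0.8), (1.0, -1.25, 0.7)]:
        print(q, radius_by_formula(p, q, x), radius_by_three_points(p, q, x))
```

For the parabola (q = 0) the subnormal is constant, u = p, so the formula reduces to ρ = (p² + y²)^{3/2}/p².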

To draw the circle of curvature we must consider that p/n is the cosine of the angle ψ formed by the normal n with the focal radius r of the point P, and accordingly we write the obtained formula as

$$\rho = \frac{n}{\cos^2\psi}.$$

From  inspection  of  this  equation  we  obtain  the  following 

Construction  of  the  radius  of  curvature:  At  the  point  of  intersection  H  of 
the  normal  with  the  x-axis  we  erect  a  perpendicular  to  the  normal.  At  its  point  of 
intersection  K  with  the  (extended)  focal  radius  we  then  erect  the  perpendicular  to 
the  focal  radius.  The  point  of  intersection  Z  of  this  second  perpendicular  with  the 
normal  is  the  center  of  curvature,  its  distance  from  P  the  desired  radius  of 
curvature. 


FIG.  53. 
Archimedes’ Squaring of a Parabola


To  determine  the  area  enclosed  in  a  parabola  section. 


The  squaring  of  a  parabola  is  one  of  Archimedes’  most  remarkable 
achievements. It was accomplished about 240 B.C. and is based upon the
properties  of  Archimedes  triangles. 

An  Archimedes  triangle  is  a  triangle  whose  sides  consist  of  two  tangents  to  a 
parabola  and  the  chord  connecting  the  points  of  tangency.  The  last-mentioned 
side  is  taken  as  the  base  line  or  the  base  of  the  triangle.  In  order  to  construct  such 
a  triangle  we  draw  the  parallels  to  the  parabola  axis  through  the  two  points  H 
and  K  of  the  directrix  and  erect  the  perpendicular  bisectors  upon  the  lines 
connecting  H  and  K  with  the  focus  F.  If  we  designate  the  point  of  intersection  of 
the  two  perpendicular  bisectors  as  S,  the  point  of  intersection  of  the  first 
perpendicular  bisector  with  the  first  parallel  to  the  axis  as  A,  and  the  point  of 
intersection  of  the  second  perpendicular  bisector  with  the  second  parallel  to  the 
axis  as  B,  then  A  and  B  are  points  of  the  parabola  and  SA  and  SB  are  tangents  of 
the  parabola  (classical  construction  of  the  parabola),  and  ASB  is  an  Archimedes 
triangle  (cf.  Figure  39). 




FIG.  54. 

Since  SA  and  SB  are  two  perpendicular  bisectors  of  the  triangle  FHK,  the 
parallel  to  the  axis  through  S  is  the  third  perpendicular  bisector;  it  consequently 
passes  through  the  center  of  HK,  and,  as  the  midline  of  the  trapezoid  AHKB,  it 
also  passes  through  the  center  M  of  AB.  This  gives  us  the  theorem:  The  median 
to  the  base  of  an  Archimedes  triangle  is  parallel  to  the  axis. 

Let  the  parabola  tangents  through  the  point  of  intersection  O  of  the  median 
SM  to  the  base  with  the  parabola  cut  SA  at  A',  SB  at  B'.  Then  AA'O  and  BB'O  are 
also  Archimedes  triangles.  Consequently,  according  to  the  above  theorem,  the 
medians  to  their  bases  are  also  parallel  to  the  axis  and  are  therefore  also  parallel 
to  SO.  These  medians  are  therefore  midlines  in  the  triangles  SAO  and  SBO,  so 
that  A'  and  B'  are  the  centers  of  SA  and  SB.  A'B'  is  consequently  the  midline  of 
the  triangle  SAB  and  is  therefore  parallel  to  AB;  also  the  point  O  on  A'B'  must  be 
the  center  of  SM. 

The  result  of  our  investigations  is  the 

Theorem of Archimedes: The median to the base of an Archimedes triangle is parallel to the axis, the midline parallel to the base is a tangent, and its point of intersection with the median to the base is a point of the parabola.

Now  we  can  determine  the  area  J  of  the  parabola  section  enclosed  in  our 
Archimedes  triangle  ASB  with  the  base  line  AB. 

The tangent A'B' and the chords OA and OB divide the triangle ASB into four sections: 1. the “internal triangle” AOB enclosed within the parabola; 2. the “external triangle” A'SB' lying outside the parabola; 3. and 4. two “residual triangles” AOA' and BOB', which are also Archimedes triangles and are penetrated by the parabola.

Since  O  lies  at  the  center  of  SM,  the  internal  triangle  is  twice  the  size  of  the 
external  triangle. 

In  the  same  fashion,  each  of  the  two  residual  triangles  in  turn  gives  rise  to  an 
internal  triangle,  an  external  triangle  and  two  new  residual  Archimedes  triangles 
that  are  penetrated  by  the  parabola,  and  once  again  each  internal  triangle  is  twice 
the  size  of  the  corresponding  external  triangle. 

Thus,  we  can  continue  without  end  and  cover  the  entire  surface  of  the  initial 
Archimedes  triangle  ASB  with  internal  and  external  triangles.  The  sum  of  all  the 
internal  triangles  must  also  be  twice  as  great  as  the  sum  of  all  the  external 
triangles.  In  other  words: 

Theorem of Archimedes: The parabola divides the Archimedes triangle into sections whose ratio is 2:1.

Or  also: 

The  area  enclosed  by  a  parabola  section  is  two  thirds  the  area  of  the 
corresponding  Archimedes  triangle. 

Archimedes  arrived  at  this  conclusion  by  a  somewhat  different  method.  He 
found  the  area  of  the  section  by  adding  together  the  areas  of  all  the  successive 
internal  triangles. 

If Δ represents the area of the initial Archimedes triangle ASB, then the area of the corresponding internal triangle is one half of Δ, the area of the corresponding external triangle is one quarter of Δ, and the area of each of the two residual triangles is one eighth of Δ. The successive Archimedes triangles therefore have the areas

$$\Delta, \quad \frac{\Delta}{8}, \quad \frac{\Delta}{8^2}, \quad \frac{\Delta}{8^3}, \quad \ldots;$$

the corresponding internal triangles possess half this area; and since each internal triangle gives rise to two new internal triangles, we thus obtain for the sum of all the successive internal triangle areas the value

$$\frac{1}{2}\left[\Delta + 2 \cdot \frac{\Delta}{8} + 4 \cdot \frac{\Delta}{8^2} + 8 \cdot \frac{\Delta}{8^3} + \cdots\right].$$

The bracket encloses a geometrical series with the quotient 1/4, the sum of which is equal to Δ/(1 − 1/4) = (4/3)Δ. Thus, we again obtain for the area of the section the value J = (2/3)Δ.
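The summation can be followed step by step with exact fractions. This short Python sketch is an added illustration (the function name `internal_triangle_sum` is an assumption); it adds up the internal triangles of the first few subdivision stages in units of the area of the initial Archimedes triangle.

```python
from fractions import Fraction

def internal_triangle_sum(levels):
    """Total area of the internal triangles of the first `levels` stages,
    measured in units of the area of the initial Archimedes triangle."""
    total = Fraction(0)
    for j in range(levels):
        # Stage j has 2**j Archimedes triangles, each of area 1/8**j;
        # every one of them contributes an internal triangle of half its area.
        total += Fraction(2 ** j, 2 * 8 ** j)
    return total

if __name__ == "__main__":
    for levels in (1, 2, 3, 10):
        s = internal_triangle_sum(levels)
        print(levels, s, float(s))
```

The partial sums 1/2, 5/8, 21/32, ... approach 2/3, and after L stages they equal (2/3)(1 − 4^(−L)).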

Since A'B' is tangent to the parabola at O, the perpendicular h dropped from O to the base line AB of the section is the altitude of the section. Since h is also half the altitude of the triangle ASB, the area of this triangle is Δ = AB · h, and J = (2/3) · AB · h, i.e.:

The  area  enclosed  by  a  parabola  section  is  equal  to  two  thirds  the  product  of 
the  base  and  the  altitude  of  the  section. 

Finally,  we  will  express  the  area  of  the  section  in  terms  of  the  transverse  q  of 
the  section,  i.e.,  by  the  projection  normal  to  the  axis  of  the  chord  bounding  the 
section. 


FIG.  55. 


We use the apex equation of the parabola, calling the coordinates of the corners of the section x | y and X | Y, and we have

$$y^2 = 2px \quad\text{and}\quad Y^2 = 2pX$$

with 2p representing the parameter. From Figure 55 it follows directly that

$$J = \tfrac{2}{3}XY - \tfrac{2}{3}xy - (X - x)\,\frac{Y + y}{2}.$$

If we replace X and x here with Y²/2p and y²/2p, we obtain 12pJ = Y³ − y³ − 3Y²y + 3Yy² = (Y − y)³. Since Y − y is the section transverse q, we finally obtain

$$12pJ = q^3.$$


This  important  formula  can  be  expressed  verbally  as  follows: 

Six  times  the  product  of  the  parameter  and  the  area  of  the  section  is  equal  to 
the  cube  of  the  section  transverse. 
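A numerical cross-check of this formula is easy to set up. The Python sketch below is an added illustration with assumed sample values and a hypothetical function name: it integrates the horizontal gap between the chord and the parabola y² = 2px over the ordinate by Simpson's rule and compares the result with q³/(12p).

```python
def segment_area(p, y, Y, steps=1000):
    """Area between the parabola y^2 = 2 p x and its chord joining the points
    with ordinates y and Y, by Simpson's rule over the ordinate (steps even)."""
    def x_of(t):
        return t * t / (2 * p)          # abscissa of the parabola point with ordinate t

    x_lo, x_hi = x_of(y), x_of(Y)

    def gap(t):
        chord = x_lo + (x_hi - x_lo) * (t - y) / (Y - y)
        return chord - x_of(t)          # horizontal distance from parabola to chord

    h = (Y - y) / steps
    total = gap(y) + gap(Y)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gap(y + i * h)
    return total * h / 3

if __name__ == "__main__":
    p, y, Y = 1.5, -1.0, 2.0            # assumed sample values; transverse q = Y - y = 3
    print(segment_area(p, y, Y), (Y - y) ** 3 / (12 * p))
```

Because the integrand is a quadratic in the ordinate, Simpson's rule reproduces the value q³/(12p) up to rounding, here 27/18 = 1.5.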
Squaring a Hyperbola


To  determine  the  surface  area  enclosed  by  a  section  of  a  hyperbola. 


We select the major axis of the hyperbola as the x-axis, the minor axis as the y-axis; the hyperbola equation then reads

(1)  $$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

where a and b are half the major and minor axes, respectively.

We must find the area A of the hyperbola section cut off at a distance of x from the apex of the hyperbola by the hyperbola chord 2y that is normal to the x-axis (Figure 56). The coordinates for the corners of the section H and K are thus x | y and x | −y.

First  we  determine  the  area  T  of  a  so-called  hyperbola  trapezoid,  i.e.,  the 
trapezoidal  surface  that  is  bounded  by  a  hyperbola  arc,  the  parallels  to  one  of  the 
asymptotes  through  the  end  points  of  the  arc,  and  the  segment  cut  off  on  the 
other  asymptote  by  these  parallels. 

Let the asymptote angle be 2α, its sine J, the sine and cosine of its halves i and o, so that i = b/e and o = a/e (with e = √(a² + b²)) and J = 2io = 2ab/e² (Figure 56).


FIG.  56. 

We choose the asymptotes as the u- and v-axes of a second (oblique-angle) coordinate system. Between the coordinates x | y and u | v of a hyperbola point in the two systems there then exist the transformation equations

(2)  x = ou + ov,  y = iv − iu,

as may be seen from Figure 57, so that for the left side of (1) we obtain the value 4uv/e² and we have the equation of the hyperbola in the second system, the so-called asymptote equation of the hyperbola

(3)  uv = P  with  P = e²/4,

in which P is the so-called power of the hyperbola.


FIG.  57. 


Let the trapezoid T to be calculated be bounded by the hyperbola arc with end point coordinates u | v and U | V (where we let U > u, V < v), by the two ordinates v and V and by the base line U − u of the trapezoid (Figure 58).

We divide the trapezoid into n equal sections t by means of parallels to the v-axis, so that T = nt, and we designate the coordinates of the points marking off the segments on the trapezoid arc as u₁ | v₁, u₂ | v₂, ..., u_{n−1} | v_{n−1}.


FIG.  58. 

The asymptote parallels through the end points $\mathfrak{u}|\mathfrak{v}$ and $\mathfrak{U}|\mathfrak{V}$ of the hyperbola arc corresponding to an arbitrary trapezoidal section t determine two parallelograms with a common base line $g = \mathfrak{U} - \mathfrak{u}$ lying on the u-axis, one of which is larger and the other smaller than t. Since these parallelograms possess the areas $Jg\mathfrak{v}$ and $Jg\mathfrak{V}$, we obtain the inequality

$$Jg\mathfrak{v} > t > Jg\mathfrak{V}.$$


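We introduce the so-called quotient of the trapezoid t, $q = \mathfrak{U}/\mathfrak{u}$, replace g on the left by $(q - 1)\mathfrak{u}$ and on the right by $[1 - (1/q)]\mathfrak{U}$, and obtain

$$J(q - 1)\,\mathfrak{u}\mathfrak{v} > t > J\left(1 - \frac{1}{q}\right)\mathfrak{U}\mathfrak{V}$$

or, as a result of (3),

$$PJ(q - 1) > t > PJ\left(1 - \frac{1}{q}\right).$$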


If we replace t here with T/n, divide by PJ, and abbreviate T/PJ as c, we obtain

$$q - 1 > \frac{c}{n} > 1 - \frac{1}{q}$$

or, solving for q,

$$1 + \frac{c}{n} < q < \frac{1}{1 - \dfrac{c}{n}}.$$


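Using this inequality for all n trapezoidal sections, we obtain the n inequalities

$$1 + \frac{c}{n} < \frac{u_1}{u} < \frac{1}{1 - \dfrac{c}{n}},$$

$$1 + \frac{c}{n} < \frac{u_2}{u_1} < \frac{1}{1 - \dfrac{c}{n}},$$

$$\vdots$$

$$1 + \frac{c}{n} < \frac{U}{u_{n-1}} < \frac{1}{1 - \dfrac{c}{n}}.$$

Multiplication of these gives

$$\left(1 + \frac{c}{n}\right)^{n} < \frac{U}{u} < \frac{1}{\left(1 - \dfrac{c}{n}\right)^{n}}.$$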

The middle term of this inequality is the so-called quotient Q = U/u of the hyperbola trapezoid T. The left and right sides tend (according to No. 12) toward the value $e^c$ for infinitely increasing n, e here representing the Euler number (2.71828...), not the quantity $\sqrt{a^2 + b^2}$ used above. This gives us the equality

$$Q = e^{c}.$$

With logarithms we obtain

(I)  $T = PJ \ln Q,$
or  verbally: 

The  area  of  the  hyperbola  trapezoid  is  proportional  to  the  natural  logarithm 
of  the  trapezoid  quotient. 

The  proportionality  constant  is  the  product  of  the  hyperbola  power  and  the 
sine  of  the  asymptote  angle. 
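
The limiting step in this argument can be made concrete numerically. The following Python sketch evaluates the two sides that enclose the quotient Q for growing n and compares them with the exponential. The function name `squeeze_bounds` and the sample value c = 1.5 are illustrative assumptions and do not appear in the original text.

```python
import math


def squeeze_bounds(c, n):
    """Return the pair ((1 + c/n)**n, (1 - c/n)**(-n)) that encloses Q.

    Requires 0 <= c < n so that 1 - c/n stays positive.
    """
    if not 0 <= c < n:
        raise ValueError("need 0 <= c < n")
    lower = (1.0 + c / n) ** n
    upper = (1.0 - c / n) ** (-n)
    return lower, upper


if __name__ == "__main__":
    c = 1.5  # sample value of c = T / (P * J), chosen for illustration
    for n in (10, 100, 10_000):
        lower, upper = squeeze_bounds(c, n)
        # Both bounds close in on exp(c) as n grows.
        print(n, lower, math.exp(c), upper)
```

As n increases the two printed bounds approach each other, which is the content of the equality between Q and the exponential of c.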

Since $4P = e^2$ and $J = 2ab/e^2$, we also have

(Ia)  $T = \tfrac{1}{2}ab \ln Q.$

If we join the end points u|v and U|V of our hyperbola arc with the hyperbola center O, we obtain a hyperbola sector to which we can similarly assign the “quotient” Q. Since the two triangles that are formed by the connecting lines mentioned and the coordinates of the end points of the arc have the areas $\tfrac{1}{2}uvJ$ and $\tfrac{1}{2}UVJ$, which are equal in view of (3), the sector has the same area S as the trapezoid:

(II)  $S = PJ \ln Q = \tfrac{1}{2}ab \ln Q.$

Now  the  determination  of  the  area  of  the  section  A  is  simple.  First,  in 
accordance  with  (2),  the  abscissas  u  and  U  of  the  section  corners  H  and  K  are 
found  to  be 


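$$u = \frac{e}{2}\left(\frac{x}{a} - \frac{y}{b}\right) \quad\text{and}\quad U = \frac{e}{2}\left(\frac{x}{a} + \frac{y}{b}\right).$$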


From  this  it  follows  that  the  quotient  of  the  sector  OHK  is 


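$$Q = \frac{U}{u} = \frac{\dfrac{x}{a} + \dfrac{y}{b}}{\dfrac{x}{a} - \dfrac{y}{b}} = \left(\frac{x}{a} + \frac{y}{b}\right)^{2} \qquad [\text{cf. } (1)]$$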


and,  consequently,  the  area  of  the  sector,  according  to  (II),  is 


$$S = ab \ln\left(\frac{x}{a} + \frac{y}{b}\right).$$


Finally,  A  is  found  to  be  the  amount  by  which  the  triangle  OHK  is  greater 
than  the  sector  OHK ,  or 


(III)  $A = xy - ab \ln\left(\dfrac{x}{a} + \dfrac{y}{b}\right).$
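
As a plausibility check on this result, the section area can be compared with a direct numerical integration of the hyperbola ordinate. The sketch below is illustrative only: the function names, the midpoint rule, and the sample values a = 3, b = 2, x = 5 are assumptions made for this example, and x is taken as the abscissa measured from the hyperbola center.

```python
import math


def segment_area_formula(a, b, x):
    """Section area x*y - a*b*ln(x/a + y/b) for the chord at abscissa x >= a."""
    y = b * math.sqrt(x * x / (a * a) - 1.0)
    return x * y - a * b * math.log(x / a + y / b)


def segment_area_numeric(a, b, x, steps=100_000):
    """Midpoint-rule estimate of 2 * integral from a to x of b*sqrt(t^2/a^2 - 1) dt."""
    h = (x - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h  # midpoint of the k-th strip
        total += b * math.sqrt(t * t / (a * a) - 1.0)
    return 2.0 * total * h


if __name__ == "__main__":
    print(segment_area_formula(3.0, 2.0, 5.0))
    print(segment_area_numeric(3.0, 2.0, 5.0))
```

The two printed values should agree to several decimal places; the small remaining difference comes from the midpoint rule near the apex, where the ordinate has a vertical tangent.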


58 


Rectification  of  a  Parabola 


To  determine  the  length  of  a  parabola  arc. 


Solution.  The  following  ingenious  solution  to  this  problem  stems  from  the 
famous  book  Lectiones  Geometricae  of  the  English  mathematician  Isaac  Barrow 
(1630-1677),  which  was  published  in  1670  in  London.  We  refer  the  parabola  to 
a coordinate system in which the x-axis is the axis of the parabola and the y-axis is the tangent at the apex. The parabola equation then reads $y^2 = 2px$. We need only
determine  the  length  of  an  “apex  arc,”  i.e.,  an  arc  of  the  parabola  that  takes  its 
origin  from  the  apex  S,  since  any  arc  can  be  represented  as  the  sum  or  difference 
of  apex  arcs.  Let  the  end  point  P  of  the  apex  arc  SP  possess  the  coordinates  X 
and  Y,  and  let  the  sought-for  length  of  the  arc  be  L. 

Since  the  subnormal  of  a  parabola  is  equal  to  the  half  parameter  p,  there 
exists  between  the  ordinate  y  of  a  point  of  the  parabola  and  the  normal  n 
corresponding  to  this  point  the  relation 


$$n^2 - y^2 = p^2.$$


If we then assign to each parabola point x|y of our coordinate system a point n|y in a new n|y-coordinate system, we obtain in the new system an equilateral hyperbola with the half axis p.

We  show  that  p  times  the  length  (pL)  of  the  parabola  arc  SP  is  numerically 
equal  to  the  surface  area  F  of  the  hyperbola  trapezoid  that  is  bounded  by  the 
hyperbola,  its  axes,  and  the  perpendicular  N  that  is  dropped  from  the  hyperbola 
point  P'  corresponding  to  the  point  P  onto  the  minor  axis  of  the  hyperbola.  (N  is 
at  the  same  time  the  abscissa  of  the  hyperbola  point  P'  and  the  parabola  normal 
at  the  parabola  point  P.) 


Let us consider a portion $\sigma = AB$ of the parabola arc SP that is short enough to be considered a rectilinear distance (a so-called arc element) and let us draw through its end points the parallel AC to the parabola axis and $BC = \eta$ to the apex tangent. At the same time we draw the ordinate y and the normal n of the midpoint of AB, which gives us a right triangle with the sides y, n, and p that is similar to the triangle ABC. As a result of this similarity we obtain the proportion $\eta : \sigma = p : n$, and this gives us the equation

(1)  $p\sigma = n\eta.$

We then draw from the hyperbola points A' and B' corresponding to the points A and B the perpendiculars to the minor axis of the hyperbola, and we obtain a narrow hyperbola trapezoid that corresponds to the arc A'B'. The area $\varphi$ of this trapezoid is the product of its altitude $\eta$ and its midline n (the latter is n because it passes through the center of the altitude and thus through the end point of the hyperbola ordinate y):

(2)  $\varphi = n\eta.$

From (1) and (2) we get

$$p\sigma = \varphi.$$

If  we  form  this  equation  for  each  element  of  the  parabola  arc  SP  and  its 
corresponding  minute  hyperbola  trapezoid,  and  if  we  add  the  resulting  equations, 
we  obtain  on  the  left  p  times  the  arc  length  L  and  on  the  right  the  area  F  of  the 
hyperbola  trapezoid  above  described,  i.e.,  the  equation 

$$pL = F.$$

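Now from the concluding formula of No. 57 it follows (the trapezoid F being the rectangle NY less half the hyperbola section obtained there for x = N, y = Y, a = b = p) that

$$F = \frac{NY}{2} + \frac{p^2}{2} \ln \frac{N + Y}{p}.$$

The sought-for arc length is thus

$$L = \frac{NY}{2p} + \frac{p}{2} \ln \frac{N + Y}{p},$$

where Y represents the ordinate, N the normal of the end point of the arc.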

We  now  slightly  transform  the  equation  we  have  found. 

Let T be the portion of the parabola tangent passing through P, bounded by P and the y-axis, and let $\tau$ be the slope angle of the parabola at point P, i.e., the angle formed by the tangent with the x-axis (and, at the same time, by the normal N with the y-axis). Then

$$\frac{NY}{2p} = \frac{Y \cdot Y}{2p \cos\tau} = \frac{X}{\cos\tau} = T$$

and

$$\frac{N + Y}{p} = \frac{N + N\cos\tau}{N\sin\tau} = \frac{1 + \cos\tau}{\sin\tau} = \frac{2\cos^2\dfrac{\tau}{2}}{2\sin\dfrac{\tau}{2}\cos\dfrac{\tau}{2}} = \cot\frac{\tau}{2};$$

consequently

$$L = T + k \ln \cot\frac{\tau}{2},$$

where we have replaced $\tfrac{1}{2}p$ by the shortest focal radius k.

Conclusion  :  An  apex  arc  of  a  parabola  exceeds  the  length  of  the  parabola 
tangent  reaching  from  the  end  of  the  arc  to  the  apex  tangent  by  a  quantity  that  is 
proportional  to  the  natural  logarithm  of  the  cotangent  of  half  the  slope  angle. 
The  proportionality  constant  is  the  shortest  focal  radius. 
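
The conclusion can be checked against a direct numerical evaluation of the arc length. The following Python sketch is only an illustration: the function names, the use of Simpson's rule, and the sample values are assumptions of this example. It uses the parabola with parameter p, the end ordinate Y > 0, the slope angle with tangent p/Y, the tangent length T = X / cos(tau), and k = p/2.

```python
import math


def arc_length_formula(p, Y):
    """Apex arc length L = T + k * ln(cot(tau / 2)) for the end ordinate Y > 0."""
    X = Y * Y / (2.0 * p)          # abscissa of the end point, from y^2 = 2px
    tau = math.atan2(p, Y)         # slope angle: tan(tau) = dy/dx = p / Y
    T = X / math.cos(tau)          # tangent segment from P to the y-axis
    k = p / 2.0                    # shortest focal radius
    return T + k * math.log(1.0 / math.tan(tau / 2.0))


def arc_length_numeric(p, Y, steps=2000):
    """Simpson's rule for the integral of sqrt(1 + (y/p)^2) dy from 0 to Y."""
    if steps % 2:
        steps += 1                 # Simpson's rule needs an even step count
    h = Y / steps
    f = lambda y: math.sqrt(1.0 + (y / p) ** 2)
    total = f(0.0) + f(Y)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(arc_length_formula(1.0, 1.0), arc_length_numeric(1.0, 1.0))
```

For p = 1 and Y = 1 both functions should print values that agree to many decimal places.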


59


Desargues’ Homology Theorem (Theorem of Homologous Triangles)


If  the  lines  connecting  the  homologous  vertexes  of  two  triangles  pass  through 
a  point,  the  points  of  intersection  of  the  homologous  sides  lie  on  a  straight  line. 

And  conversely: 

If  the  points  of  intersection  of  the  homologous  sides  of  two  triangles  lie  on  a 
straight  line,  the  lines  connecting  the  homologous  vertexes  pass  through  a  point. 

One  frequently  has  occasion  to  correlate  to  each  other  the  vertexes  and  sides 
of  two  triangles  (e.g.,  similar  triangles),  and  in  these  cases  for  the  sake  of 
convenience  the  mutually  correlated,  so-called  “homologous”  vertexes  and  sides 
are  usually  designated  by  the  same  letter.  Thus,  one  may  have,  for  example,  the 
homologous  vertexes  A  and  A',  B  and  B',  and  finally  C  and  C',  as  well  as  the 
homologous  sides  BC  =  a  and  B'C'  =  a',  CA  =  b  and  C'A'  =  b',  and  finally  AB  =  c 
and  A'B'  =  c'. 

Two such triangles, for which we will assume that no pair of homologous vertexes or sides coincides, are called copolar [perspective from a point] when the lines AA', BB', CC' connecting the homologous vertexes pass through one point, the so-called homology pole. They are called coaxial [perspective from a line] when the points of intersection aa', bb', cc' of the homologous sides lie on a straight line, the so-called homology axis.

Using  these  terms,  the  above  theorem  can  be  expressed  in  the  abbreviated 
form  of: 

Desargues’  homology  theorem  :  Copolar  triangles  are  coaxial,  coaxial 
triangles  are  copolar. 

Triangles  that  are  both  copolar  and  coaxial  are  called  homologous  triangles. 


The  theorem  of  homologous  triangles  was  discovered  by  the  French 
mathematician and engineer Girard Desargues (1591-1661) in about 1636 and is
therefore  known  as  Desargues’  theorem.  However,  according  to  the  Greek 
mathematician  Pappus,  this  theorem  was  already  contained  in  the  lost  treatise  on 
porisms  of  Euclid. 

Desargues’  theorem  plays  a  very  important  role  in  projective  geometry. 
Consequently,  we  will  prove  it  in  a  projective  manner  though  other,  shorter 
proofs  are  possible. 

For  the  reader  unfamiliar  with  projective  geometry  it  may  be  appropriate  to 
provide  a  short  exposition  of  its  most  important  concepts  and  its  simplest 
theorems,  especially  as  they  will  be  encountered  in  the  next  few  sections  as  well. 

The  totality  of  the  points  (considered  as  rigidly  connected  to  each  other)  in  a 
line  is  called  a  range  of  points;  the  line  is  called  the  base  of  the  range.  The 
totality  of  the  lines  (considered  as  rigidly  connected  to  each  other)  that  pass 
through  one  point  is  called  a  ray  pencil;  the  point  is  called  the  center  of  the 
pencil.  Similarly,  the  totality  of  the  points  of  a  circle  or,  more  generally,  of  a 
conic  section  is  called  a  circular  or  conic  range  of  points  or  field  of  points;  the 
totality  of  the  tangents  of  a  conic  section  is  called  a  field  of  tangents  of  a  conic 
section.  Ranges  of  points,  pencils,  and  tangent  families  are  the  basic  structures 
of  plane  projective  geometry,  and  the  points,  rays,  and  tangents  are  the  elements 
of  the  corresponding  structures. 

Two basic figures are called projective (symbol: ⊼) when their elements are unequivocally related to each other in such manner that every four elements of the one figure and the four corresponding or “homologous” elements of the other have the same cross ratio. The relation existing between the figures is called projectivity.

[The cross ratio (ABCD) of four points A, B, C, D of a straight line is the ratio

$$\frac{AC}{BC} : \frac{AD}{BD};$$

the cross ratio (abcd) of four rays a, b, c, d of a pencil is the ratio

$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd}.$$

The cross ratio of four points of a circle is the cross ratio of the four rays that connect the four points with a fifth point of the circle, where (according to the inscribed angle theorem) this fifth point can be chosen at pleasure. The cross ratio of four points of a conic section is similarly the cross ratio of the four rays that join the four points with an arbitrarily chosen fifth point of the conic section (cf. No. 61). Finally, the cross ratio of four conic section tangents is the cross ratio of their points of tangency.]

A  projectivity  is  completely  determined  if  three  elements  of  one  structure  and 
the  corresponding  elements  of  the  other  are  given. 

Two  projective  structures  are  called  conjective  when  their  bases  (or  centers) 
coincide. 

A particularly important case of projectivity is perspectivity. A range of points and a ray pencil are called perspective (symbol: ⩞) when each element of the range lies on the corresponding element of the pencil. Each ray is called the reflection of the homologous point, the whole pencil is called the reflection of the range. Two nonconjective ranges are called perspective (symbol: ⩞) when the lines connecting the homologous points pass through one point, the center of perspectivity. Two ray pencils are called perspective if every pair of corresponding rays intersect on one straight line, the axis of perspectivity.

The  projectivity  of  two  perspective  figures  follows  from 

Pappus’  theorem:  The  cross  ratio  of four  rays  of  a  pencil  is  equal  to  the  cross 
ratio  of  the  four  points  at  which  an  arbitrary  line  cuts  the  rays. 

(Pappus  of  Alexandria,  fourth  century  a.d.,  Collectiones  mathematicae .) 

Proof. Let A, B, C, D be the four points of intersection of a line with the pencil of four rays OA = a, OB = b, OC = c, OD = d. We designate the sine of the angle formed by two rays, for example, a and c, with each other as sin ac. Since the perpendiculars from A and B to c have the lengths a sin ac and b sin bc (a and b here also standing for the lengths OA and OB) and are in the same ratio as AC to BC, we obtain the proportion

a sin ac : b sin bc = AC : BC.

Similarly,

a sin ad : b sin bd = AD : BD.

By division of these two equations we obtain

$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd} = \frac{AC}{BC} : \frac{AD}{BD}.$$

Q.E.D.
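
Pappus’ theorem lends itself to a quick numerical illustration. In the following Python sketch the pencil center is placed at the origin, the rays are given by their direction angles, and a transversal is given by a point q and a direction v; segments on the transversal and angles between rays are taken with sign. All names and sample values are assumptions made for this example.

```python
import math


def cross_ratio_rays(ta, tb, tc, td):
    """(abcd) = (sin ac / sin bc) : (sin ad / sin bd) for ray angles ta..td."""
    s = lambda u, v: math.sin(v - u)   # signed sine of the angle from ray u to ray v
    return (s(ta, tc) / s(tb, tc)) / (s(ta, td) / s(tb, td))


def cross_ratio_points(A, B, C, D):
    """(ABCD) = (AC / BC) : (AD / BD) for points given by line parameters."""
    return ((C - A) / (C - B)) / ((D - A) / (D - B))


def cut_ray(theta, q, v):
    """Parameter t at which the line q + t*v meets the line through the
    origin with direction angle theta (assumed not parallel to v)."""
    dx, dy = math.cos(theta), math.sin(theta)
    denom = v[0] * dy - v[1] * dx
    if abs(denom) < 1e-12:
        raise ValueError("transversal is parallel to the ray")
    return -(q[0] * dy - q[1] * dx) / denom


if __name__ == "__main__":
    angles = (0.2, 0.7, 1.1, 1.9)
    print("pencil:", cross_ratio_rays(*angles))
    for q, v in (((1.0, 2.0), (3.0, -1.0)), ((-2.0, 0.5), (1.0, 4.0))):
        ts = [cut_ray(t, q, v) for t in angles]
        print("transversal:", cross_ratio_points(*ts))
```

Each transversal should reproduce the cross ratio of the pencil, whatever its position.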


Two projective ranges or pencils can always be brought into a perspective position.

Two  projective  ranges  (pencils)  become  perspective  when  they  are  placed  in 
such  a  way  that  an  element  of  one  range  (pencil)  falls  on  the  homologous 
element  of  the  other  range  (pencil),  though  the  bases  (centers)  do  not  coincide. 
We  have  the  following  two  important  theorems: 

I. If in the projectivity between two ranges the point of intersection of the two bases corresponds to itself, the ranges are perspective.

II. If in the projectivity between two pencils the line connecting the two centers corresponds to itself, the pencils are perspective.

Proof of I. Let the bases of the two ranges be $\mathfrak{g}$ and $\mathfrak{g}'$, and let their point of intersection, which corresponds to itself, be O = O'. On $\mathfrak{g}$ we choose two fixed elements A, B and an arbitrary point P, and we designate the homologous elements on $\mathfrak{g}'$ as A', B', and P'. We find the point of intersection S of the lines AA' and BB' and assign to the lines connecting the designated elements with S the same letters, but in lower case. Then, according to Pappus,

(oabp) = (OABP) and (o'a'b'p') = (O'A'B'P').

But since the right sides of these equations are equal, according to our assumption, it follows that

(o'a'b'p') = (oabp).

But if two equal cross ratios agree in the first three elements (o' = o, a' = a, b' = b), then they also agree in the fourth. Consequently, p' falls on p, and thus PP' passes through S, and the ranges are perspective.

Proof of II. Let the centers of the two projective pencils $\mathfrak{B}$ and $\mathfrak{B}'$ be Z and Z', their self-corresponding connecting line o = o'. We select on $\mathfrak{B}$ two fixed elements a and b and an arbitrary element p and designate the homologous elements of $\mathfrak{B}'$ as a', b', and p'. We find the connecting line g of the points aa' and bb' and assign to the points of intersection of the designated elements with g the same letters, but capitals. Then, according to Pappus,

(oabp) = (OABP) and (o'a'b'p') = (O'A'B'P').

But since the left sides of these equations are equal, in accordance with our initial assumption,

(O'A'B'P') = (OABP).

But if two equal cross ratios agree in the first three elements (O' = O, A' = A, B' = B), they also agree in the fourth. P' therefore falls on P, p and p' thus intersect on g, and the pencils $\mathfrak{B}$ and $\mathfrak{B}'$ are perspective.




The proof of Desargues’ theorem is now easily obtained (Figure 62). We call the vertexes of one triangle A, B, C, the sides opposite them a, b, c, the homologous vertexes of the other triangle A', B', C', the sides opposite them a', b', c'.

Let the points of intersection of the homologous sides a and a', b and b', c and c' be X, Y, and Z, respectively, and let the points of intersection of the line CC' with the two lines AB and A'B' be H and H'.

The  proof  divides  into  two  parts. 

I. We assume that the connecting lines AA', BB', CC' pass through one point O. We project the range of points AB from O onto A'B' and obtain two perspective ranges in which the elements A, B, H, Z of the first are homologous to the elements A', B', H', Z' = Z of the second. We then connect the points of these ranges with C and C', respectively, thereby obtaining two projective ray pencils in which the elements CA, CB, CH = CC', CZ correspond to the elements C'A', C'B', C'H' = C'C, C'Z'. Since the line CC' connecting the pencil centers corresponds to itself in this projectivity, the pencils are perspective and the points of intersection of the homologous rays lie on a straight line. Thus, in particular, the points of intersection Y (of CA and C'A'), X (of CB and C'B'), and Z (of CZ and C'Z') lie on a straight line.

II. We assume that the points aa' (X), bb' (Y), cc' (Z) lie on a straight line g. We connect the points of the line g with C and C', thereby obtaining two perspective ray pencils in which the elements a, b, CC', CZ of the first pencil correspond to the elements a', b', C'C, C'Z' = C'Z of the second. We cut these pencils with the lines c and c' and obtain two projective ranges in which the elements B, A, H, Z of the first range correspond to the elements B', A', H', Z' = Z of the second. Since the point of intersection Z = Z' of the range bases corresponds to itself in this projectivity, the ranges are perspective and the connecting lines BB', AA', and HH' = CC' of the homologous elements thus pass through one point, which was to be proved.
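
Both directions of the theorem can be tried out on a concrete pair of triangles. The sketch below works with homogeneous integer coordinates, in which the line through two points and the intersection of two lines are both given by a cross product, so that collinearity and concurrency reduce to an exact determinant test. The helper names and the sample triangles are assumptions made for this illustration.

```python
def cross(p, q):
    """Cross product: the line through two points, or the point on two lines."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def det3(p, q, r):
    """Determinant of three homogeneous triples (zero iff they are dependent)."""
    c = cross(q, r)
    return p[0] * c[0] + p[1] * c[1] + p[2] * c[2]


def are_copolar(A, B, C, A1, B1, C1):
    """True if the lines AA', BB', CC' pass through one point."""
    return det3(cross(A, A1), cross(B, B1), cross(C, C1)) == 0


def are_coaxial(A, B, C, A1, B1, C1):
    """True if the points aa', bb', cc' lie on one line."""
    X = cross(cross(B, C), cross(B1, C1))   # a meets a'
    Y = cross(cross(C, A), cross(C1, A1))   # b meets b'
    Z = cross(cross(A, B), cross(A1, B1))   # c meets c'
    return det3(X, Y, Z) == 0


if __name__ == "__main__":
    # Triangles chosen so that AA', BB', CC' all pass through the origin.
    A, B, C = (1, 0, 1), (0, 1, 1), (2, 4, 1)
    A1, B1, C1 = (3, 0, 1), (0, 2, 1), (1, 2, 1)
    print(are_copolar(A, B, C, A1, B1, C1), are_coaxial(A, B, C, A1, B1, C1))
```

Because the coordinates are integers, the determinant test is exact; moving C' off the line through the pole and C should make both tests fail together.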


FIG.  62. 
60


Steiner’s Double Element Construction


To draw the double elements of a conjective projectivity that is given by three pairs of homologous elements.

A  double  element  of  a  conjective  projectivity  is  an  element  that  coincides 
with  its  homolog. 

The following simple solution to this fundamental problem of projective geometry was discovered by the German mathematician Jakob Steiner (Die geometrischen Konstruktionen, etc. [cf. No. 34], Berlin, 1833).

Steiner’s double element construction enriched the geometry of antiquity by providing it with a new and fruitful method for solving problems of geometric construction. This so-called method of false position (regula falsi) is based on the theorem:

If in the projectivity between two ray pencils the line connecting the pencil centers corresponds to itself, the pencils are perspective (No. 59).

We  can  distinguish  three  cases: 

I. Double elements of a projectivity on a circle. Let the projectivity between the two ranges of points $\mathfrak{R}$ and $\mathfrak{R}'$ of the circle $\mathfrak{K}$ be given by the two corresponding point triplets (A, B, C) and (A', B', C'). We consider the ray pencils $\mathfrak{B}$ and $\mathfrak{B}'$, whose rays run from the points of ranges $\mathfrak{R}$ and $\mathfrak{R}'$, respectively, through the centers A' and A, respectively. Since $\mathfrak{R}$ ⩞ $\mathfrak{B}$ and $\mathfrak{R}'$ ⩞ $\mathfrak{B}'$, and, according to our assumption, $\mathfrak{R}$ ⊼ $\mathfrak{R}'$, it is also true that $\mathfrak{B}$ ⊼ $\mathfrak{B}'$. But since in the line AA' connecting the centers of the two pencils $\mathfrak{B}$ and $\mathfrak{B}'$ corresponding pencil elements coincide, the latter projectivity is a perspectivity. The axis of perspectivity is the line g connecting the point of intersection of the rays A'B and AB' with the point of intersection of the rays A'C and AC'. Two corresponding rays of $\mathfrak{B}$ and $\mathfrak{B}'$ thus always intersect on g. Thus, in order to obtain a point P' of $\mathfrak{R}'$ corresponding to the arbitrary point P of $\mathfrak{R}$, we need only connect the point of intersection of A'P and g with A. The connecting line cuts $\mathfrak{K}$ at P'. If we carry out this construction for the points of intersection H and K of the perspectivity axis with the circle, H' falls on H, K' on K. The double points of the projectivity on a circle are therefore the points of intersection of the circle with the above perspectivity axis.
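
The construction on the circle can be followed step by step in coordinates. The Python sketch below assumes the unit circle and Euclidean point pairs; it builds the perspectivity axis from the two intersections A'B with AB' and A'C with AC' and then cuts the circle with it. The helper names are hypothetical, and the sample projectivity (doubling the parameter t of the rational parametrization `on_circle`, whose double points are known to be (1, 0) and (-1, 0)) is an assumption chosen only so that the result can be checked.

```python
import math


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def join(P, Q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through two points."""
    return cross((P[0], P[1], 1.0), (Q[0], Q[1], 1.0))


def meet(l, m):
    """Euclidean intersection of two lines (assumed not parallel)."""
    x, y, w = cross(l, m)
    return (x / w, y / w)


def steiner_axis(A, B, C, A1, B1, C1):
    """Axis through (A'B meets AB') and (A'C meets AC')."""
    U = meet(join(A1, B), join(A, B1))
    V = meet(join(A1, C), join(A, C1))
    return join(U, V)


def double_points(axis):
    """Intersections of the axis with the unit circle (empty list if none)."""
    a, b, c = axis
    n2 = a * a + b * b
    disc = n2 - c * c
    if disc < 0:
        return []                       # axis misses the circle: no real double points
    x0, y0 = -a * c / n2, -b * c / n2   # foot of the perpendicular from the center
    s = math.sqrt(disc) / n2
    return [(x0 + b * s, y0 - a * s), (x0 - b * s, y0 + a * s)]


def on_circle(t):
    """Rational parametrization of the unit circle (illustrative helper)."""
    return ((1 - t * t) / (1 + t * t), 2 * t / (1 + t * t))


if __name__ == "__main__":
    A, B, C = on_circle(1.0), on_circle(3.0), on_circle(-1.0)
    A1, B1, C1 = on_circle(2.0), on_circle(6.0), on_circle(-2.0)
    print(double_points(steiner_axis(A, B, C, A1, B1, C1)))
```

For the sample data the axis comes out as the x-axis, so the printed double points should be close to (1, 0) and (-1, 0).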




FIG.  63. 

II. Double elements of two ray pencils. We draw a circle $\mathfrak{K}$ through the common center of the two projective pencils and, in accordance with I., we draw the double points of the two ranges at which the rays of the two pencils cut $\mathfrak{K}$. The pencil rays passing to these double points are the double rays we are looking for.

III.  Double  elements  of  two  ranges  of points.  We  draw,  in  accordance  with  II., 
the  double  rays  of  the  two  pencils  that  are  obtained  from  the  lines  connecting  the 
points  of  the  two  conjective  projective  ranges  with  an  arbitrary  center  Z  outside 
the  base  of  the  range.  The  points  of  intersection  of  the  two  double  rays  with  the 
base  of  the  range  are  the  double  points  we  are  looking  for. 


61 


Pascal’s  Hexagon  Theorem 


To  demonstrate  that  the  three  points  of  intersection  of  the  opposite  sides  of  a 
hexagon  inscribed  in  a  conic  section  lie  on  a  straight  line. 


A  hexagon  inscribed  in  a  conic  section  essentially  consists  of  six  points 
anywhere  on  the  conic  section  1,  2,  3,  4,  5,  6,  the  “vertexes”  of  the  hexagon,  and 
the  six  connecting  lines  12,  23,  34,  45,  56,  61,  the  “sides”  of  the  hexagon.  The 
sides  12  and  45,  the  sides  23  and  56,  and  finally  34  and  61  are  called  the 
“opposite  sides.”  The  straight  line  on  which  the  three  points  of  intersection  of  the 
opposite  sides  lie  is  called  the  Pascal  line,  and  the  hexagon  is  called  the  Pascal 
hexagon.  In  a  somewhat  more  abbreviated  form  the  theorem  to  be  proved  can  be 
stated  as: 

The  three  points  of  intersection  of  a  Pascal  hexagon  lie  on  a  straight  line. 

This fundamental theorem in conic section theory was published in 1640 by Blaise Pascal (1623-1662) at the age of 16 in his six-page Essai sur les Coniques.

There  are  a  number  of  proofs  of  the  Pascal  theorem.  The  following  projective 
proof  is  based  upon  the  two  theorems  of  Steiner: 

I.  The  points  of  a  conic  section  are  projected  from  pairs  of  themselves  by 
projective  pencils. 

II.  If  in  the  projectivity  between  two  ranges  of  points  the  point  of  intersection 
of  their  bases  corresponds  to  itself  the  ranges  are  perspective. 

Proof of I. The theorem applies most directly to the circle. (In circles the designated pencils are even congruent.) Now, since a conic section is the central projection of a circle, and since in this projection the pencils we are concerned with appear as projections of projective ray pencils in a circle, we need only show that the central projection of a pencil on a plane is projective with respect to the pencil. Now this is the case according to Pappus’ theorem. Specifically, if a, b, c, d are four rays lying in plane E, a', b', c', d' their central projections on plane E', and A, B, C, D the points of intersection of the ray pairs (a, a'), (b, b'), (c, c'), and (d, d') lying on the line of intersection of the two planes, then, according to Pappus,

(a'b'c'd') = (ABCD) and (abcd) = (ABCD),

thus, also

(a'b'c'd') = (abcd),

i.e.,  the  pencil  and  the  pencil  projection  are  projective. 

The  proof  of  II.  is  in  No.  59. 

Now  to  prove  the  Pascal  theorem! 

Let the vertexes of the hexagon be 1, 2, 3, 4, 5, 6. According to I., the rays from the centers 1 and 3 to the conic section points 2, 4, 5, 6 form projective pencils; thus the points of intersection 2', 4', 5', 6' and 2", 4", 5", 6" of these rays with the straight lines 54 and 56, respectively, form projective ranges. Since at the point of intersection 5 of their bases the corresponding range elements are coincident (5' = 5"), the ranges are perspective according to II., and consequently the lines 2'2", 4'4", and 6'6" pass through one point, the point of intersection Z of the lines 4'4" and 6'6", i.e., of the lines 34 and 61. In other words: The points of intersection of the opposite sides 2' (intersection of 12 and 45), 2" (intersection of 23 and 56), and Z (intersection of 34 and 61) lie on one straight line, the Pascal line p = 2'Z2". Q.E.D.
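
The statement can be tested exactly on a sample hexagon. The following Python sketch takes six points of the ellipse x²/9 + y²/4 = 1 in homogeneous integer coordinates, forms the three intersections of opposite sides by cross products, and checks their collinearity with a determinant. The ellipse, its rational parametrization `conic_point`, and the parameter values are assumptions chosen for this illustration.

```python
def cross(p, q):
    """Cross product: the line through two points, or the point on two lines."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def det3(p, q, r):
    c = cross(q, r)
    return p[0] * c[0] + p[1] * c[1] + p[2] * c[2]


def conic_point(t):
    """Homogeneous point (3(1 - t^2), 4t, 1 + t^2) on x^2/9 + y^2/4 = 1."""
    return (3 * (1 - t * t), 4 * t, 1 + t * t)


def pascal_points(hexagon):
    """Intersections of the opposite sides 12/45, 23/56, 34/61."""
    p1, p2, p3, p4, p5, p6 = hexagon
    X = cross(cross(p1, p2), cross(p4, p5))
    Y = cross(cross(p2, p3), cross(p5, p6))
    Z = cross(cross(p3, p4), cross(p6, p1))
    return X, Y, Z


def pascal_determinant(hexagon):
    """Zero exactly when the three intersection points are collinear."""
    return det3(*pascal_points(hexagon))


if __name__ == "__main__":
    hexagon = [conic_point(t) for t in (0, 1, 2, 3, -1, -2)]
    print(pascal_determinant(hexagon))
```

With integer coordinates the determinant is computed without rounding, so a printed zero is an exact confirmation for this hexagon; replacing one vertex by a point off the ellipse should give a nonzero value.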
FIG. 64.


The  converse  of  Pascal’s  theorem:  If  the  opposite  sides  of  a  hexagon  (of 
which  no  three  vertexes  lie  on  a  straight  line)  intersect  on  a  straight  line ,  the  six 
vertexes  lie  on  a  conic  section. 

Indirect proof. Let the conic section that is unequivocally determined by the five vertexes 1, 2, 3, 4, 5 cut the fifth side of the hexagon 56 a second time at 6*. According to Pascal’s theorem, we obtain 6* by drawing the Pascal line (as the line connecting the points of intersection of the opposite sides 12 and 45, as well as 23 and 56 = 56*), causing it to intersect with 34 at Z and determining the point of intersection (6*) of 1Z with 56* = 56. But according to our assumption, this is 6, so that 6* = 6.

If  two  vertexes  of  a  Pascal  hexagon  coincide  once  or  twice  or  three  times, 
there  follow  the  corollaries  of  the  Pascal  theorem,  the  most  important  of  which 
we  will  now  give. 

I.  The  vertexes  5  and  6  coincide:  this  is  to  be  considered  as  meaning  that 
point  6  approaches  point  5  ever  more  closely  until  it  finally  coincides  with  it. 
This  transforms  the  chord  56  into  the  tangent  at  point  5  and  the  hexagon  is 
transformed  into  the  pentagon  1  2  3  4  5.  Pascal’s  theorem  then  assumes  the 
form: 

Corollary 1 (Figure 65): In every pentagon inscribed in a conic section the points of intersection of two pairs of nonadjacent sides and the point of intersection of the fifth side with the tangent passing through the opposite vertex lie on a straight line.


FIG.  65. 


II.  The  vertexes  5  and  6  coincide  and  the  vertexes  2  and  3  coincide;  the 
hexagon  thus  becomes  a  tetragon  1  2  4  5.  Now  the  opposite  sides  of  the  tetragon 
12  and  45,  and  likewise  24  and  51,  and  the  tangents  at  the  opposite  vertexes  2 
and  5  intersect  each  other  on  a  straight  line. 


FIG.  66. 

Since  we  could  just  as  easily  choose  the  two  other  opposite  vertexes,  the  point  of 
intersection  of  the  tangents  at  these  vertexes  also  lies  on  the  Pascal  line.  We 
therefore  obtain  the  following 

Corollary  2  (Figure  66):  In  every  tetragon  inscribed  in  a  conic  section  all 
the  pairs  of  opposite  sides  and  tangents  to  the  pairs  of  opposite  vertexes 
intersect  on  a  straight  line. 


FIG.  67. 


III.  The  vertexes  1  and  2  coincide,  so  do  vertexes  3  and  4,  and  so  do  vertexes 
5  and  6;  the  hexagon  becomes  a  triangle,  and  we  obtain 

Corollary  3  (Figure  67):  In  every  triangle  inscribed  in  a  conic  section  the 
sides  intersect  with  the  tangents  to  the  opposite  vertexes  on  a  straight  line. 
62


Brianchon’s Hexagram Theorem


To  demonstrate  that  the  three  opposite  vertex  lines  of  a  hexagram 
circumscribed  about  a  conic  section  pass  through  a  point. 


A hexagram circumscribed about a conic section consists essentially of six tangents I, II, III, IV, V, VI to the conic section, which are the sides of the hexagram, and the six points of intersection I II, II III, III IV, IV V, V VI, VI I forming the vertexes of the hexagram. The vertexes I II and IV V, the vertexes II III and V VI, and the vertexes III IV and VI I are called opposite vertexes, and the lines connecting them are called opposite vertex lines.

The  point  through  which  the  three  opposite  vertex  lines  pass  is  called  the 
Brianchon  point  and  the  hexagram  the  Brianchon  hexagram.  The  theorem  to  be 
proved  can  be  stated  in  a  somewhat  shorter  form  as  follows. 

The  three  opposite  vertex  lines  of  a  Brianchon  hexagram  pass  through  a 
point. 

This theorem, which is as important in the theory of conic sections as the Pascal theorem, was published in 1810 by the French mathematician Brianchon (1785-1864) in the Journal de l’École Polytechnique.

The  following  projective  proof  of  Brianchon’s  theorem  is  based  on  the  two 
theorems  of  Steiner  : 

I.  The  tangents  of  a  conic  section  cut  two  of  the  tangents  into  projective 
ranges  of  points. 

II.  If  in  the  projectivity  between  two  ray  pencils  the  line  joining  the  pencil 
centers  corresponds  to  itself  the  pencils  are  perspective. 

Proof of I. We first prove I. for a circle. For this purpose let us consider the following structures: 1. the range of points $\mathfrak{R}$ through which a moving point P on the circle passes; 2. the pencil $\mathfrak{B}$ of the rays FP that run from the fixed circle point F to the moving point P; 3. the field $\mathfrak{T}$ of tangents t drawn at the different positions of P; 4. the range $\mathfrak{r}$ of the points of intersection S of these tangents with the fixed circle tangent f at F; 5. finally, the pencil $\mathfrak{b}$ of the rays MS that run from the center point M of the circle to S. Then $\mathfrak{R}$, $\mathfrak{B}$, and $\mathfrak{T}$ are projective by definition, $\mathfrak{B}$ and $\mathfrak{b}$ are projective because they are congruent (every ray of $\mathfrak{B}$ is perpendicular to the corresponding ray of $\mathfrak{b}$), and finally $\mathfrak{r}$ and $\mathfrak{b}$ are projective because they are perspective. Consequently, $\mathfrak{T}$ and $\mathfrak{r}$ are projective, i.e.:

A field of tangents to a circle is projective with respect to the range of points
that the tangents of the field generate on an arbitrary fixed tangent. From this it
follows directly that:

The  tangents  of  a  circle  cut  two  of  them  into  projective  ranges  of points. 

We will now prove theorem I. for a conic section. The conic section is the
central  projection  of  a  circle  in  which  its  tangents  are  perspectives  of  circle 
tangents.  In  this  projection  the  ranges  of  points  mentioned  appear  as  perspectives 
of  the  two  ranges  that  the  circle  tangents  generate  on  the  two  fixed  circle 
tangents,  which  correspond  to  the  chosen  conic  section  tangents  in  the  central 
projection.  Now,  since  the  latter  ranges  are  projective,  the  former  must  also  be. 

Proof  of  II.  is  given  in  No.  59. 

Now  for  the  proof  of  Brianchon’s  theorem! 

Let the sides of the hexagram be I, II, III, IV, V, VI. According to auxiliary
theorem I., the points of intersection generated on tangents I and III by II, IV, V,
VI form projective ranges of points, and consequently the junction lines II', IV',
V', VI' and II", IV", V", VI" of these points with the points (centers) V IV and V
VI form projective pencils. Since in the line V connecting the centers,
corresponding rays (V' = V") coincide, the pencils are perspective according to
auxiliary theorem II., and the rays II' and II", IV' and IV", and VI' and VI"
intersect on one straight line, the axis of perspectivity, the junction line a of the
points IV' IV" and VI' VI", i.e., of the points III IV and VI I. In other words: The
opposite vertex lines II' (from I II to IV V), II" (from II III to V VI), and a (from
III IV to VI I) pass through one point, the Brianchon point. Q.E.D.
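The statement can also be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes the conic is the unit circle, picks six tangents at rational points (the parameter values are arbitrary choices), and works in homogeneous coordinates with exact fractions. The helper names `cross`, `det3`, `circle_tangent`, and `opposite_vertex_lines` are illustrative only.

```python
from fractions import Fraction as F

def cross(u, v):
    """Join of two points or meet of two lines in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(a, b, c):
    """Zero exactly when three lines are concurrent (or three points collinear)."""
    return sum(x * y for x, y in zip(a, cross(b, c)))

def circle_tangent(t):
    """Tangent to the unit circle at the rational point with parameter t."""
    t = F(t)
    x0, y0 = (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)
    return (x0, y0, F(-1))              # the line x0*x + y0*y - 1 = 0

def opposite_vertex_lines(params):
    sides = [circle_tangent(t) for t in params]                  # I, II, ..., VI
    vertexes = [cross(sides[i], sides[(i + 1) % 6]) for i in range(6)]  # I II, ..., VI I
    # join I II with IV V, II III with V VI, III IV with VI I
    return [cross(vertexes[i], vertexes[i + 3]) for i in range(3)]

lines = opposite_vertex_lines([-3, -1, F(-1, 3), F(1, 2), 2, 5])
concurrent = det3(*lines) == 0
print(concurrent)
```

Because the arithmetic is exact, a vanishing determinant here means the three opposite vertex lines really do meet in one point for this particular hexagram; it illustrates the theorem but of course does not replace the proof.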


The converse of Brianchon’s theorem: If the opposite vertex lines of a
hexagram  (of  which  three  sides  do  not  pass  through  one  point)  pass  through  a 
point,  the  sides  of  the  hexagram  form  tangents  of  a  conic  section. 

Indirect  proof,  similar  to  the  proof  of  the  converse  of  Pascal’s  theorem  (No. 
61). 

If  two  sides  of  the  Brianchon  hexagram  coincide  once  or  twice  or  three  times, 
we  obtain  the  corollaries  of  the  Brianchon  theorem,  the  most  important  of  which 
we  will  here  mention. 

I.  The  sides  V  and  VI  coincide;  this  is  to  be  considered  as  a  situation  in  which 
side  VI  comes  closer  and  closer  to  side  V  and  finally  coincides  with  it.  The  point 
of  intersection  V  VI  then  becomes  the  point  of  tangency  of  the  tangent  V,  and  the 
hexagram  becomes  the  pentagram  I  II  III  IV  V.  Brianchon’s  theorem  then 
assumes  the  following  form: 

Corollary  1  (Figure  69):  In  every  pentagram  circumscribed  about  a  conic 
section  the  lines  joining  two  pairs  of  nonadjacent  vertexes  and  the  junction  line 
of  the  fifth  vertex  with  the  point  of  tangency  of  its  opposite  side  pass  through  one 
point. 


FIG.  69. 


II. The sides V and VI coincide, and the sides II and III coincide; here the
hexagram becomes the tetragram I II IV V. Now the junction lines of the
opposite vertexes I II and IV V, as well as those of II IV and V I, and also the
junction line of the tangency points of II and V pass through one point. Since
we could as easily select the tangency points of the opposite sides I and IV, their
junction line also passes through the Brianchon point. Consequently, we obtain


Corollary  2  (Figure  70):  In  every  tetragram  circumscribed  about  a  conic 
section  the  two  diagonals  and  the  two  tangency  chords  of  the  opposite  sides  pass 
through  one  point. 


FIG.  71. 


III.  The  sides  I  and  II  coincide,  the  sides  III  and  IV  coincide,  and  the  sides  V 
and  VI  also  coincide;  the  hexagram  becomes  a  trigram,  and  we  obtain 

Corollary  3  (Figure  71):  In  every  triangle  circumscribed  about  a  conic 
section  the  lines  connecting  the  vertexes  with  the  tangency  points  of  the  opposite 
sides  pass  through  one  point. 
Desargues’ Involution Theorem


The points of intersection of a line with the three pairs of opposite sides of a
complete tetragon and a conic section circumscribed about this tetragon form
four point pairs of an involution. The lines joining a point with the three pairs of
opposite vertexes of a complete tetragram and the tangents drawn from the
point to a conic section inscribed in the tetragram form four ray pairs of an
involution.


It  is  here  assumed  that  the  line  does  not  pass  through  a  corner  of  the  tetragon 
and  that  the  point  does  not  lie  on  a  side  of  the  tetragram. 

This double theorem was formulated and proved in 1639 by Desargues (No.
59) in his major work on conic sections. The work bears the strange title
Brouillon-Projet d’une atteinte aux événements des rencontres d’un cône avec un
plan, or approximately in English “First Draft of a Projected Essay on the
Phenomena Arising from the Intersection of a Cone with a Plane.”

Desargues  was  the  source  of  the  concept  of  involution  and  of  an  amazing 
series  of  involution  theorems  as  well,  so  that  it  seems  appropriate  at  this  point  to 
take  up  briefly  for  readers  unfamiliar  with  it  the  most  significant  properties  of 
involution. 

In  a  conjective  projectivity  (No.  59)  between  two  homologous  structures  I 
and  II  each  element  of  a  common  base  can  be  assigned  to  I  as  well  as  II.  Now,  if 
there  are  two  elements  A  and  B  of  the  base  such  that  to  the  element  A  of  I  there 
corresponds  the  element  B  of  II  and  simultaneously  to  the  element  B  of  I  there 
corresponds  the  element  A  of  II,  we  say  that  the  elements  A  and  B  are  conjugate 
(to  each  other)  or  correspond  to  each  other  in  double  fashion. 

Let  us  consider  in  addition  to  the  conjugate  point  pair  (A,  B)  another  arbitrary 
pair  of  homologous  elements:  P  from  I  and  Q  from  II.  From  the  equation 

(ABPQ) = (BAQP)

it  then  follows  that  to  the  element  Q  from  I  there  also  corresponds  the  element  P 
from  II,  i.e.,  P  and  Q  are  also  conjugate.  Thus,  if  one  pair  of  homologous 
elements  in  a  conjective  projectivity  is  composed  of  conjugate  elements,  then 
every  pair  is  composed  of  conjugate  elements. 

A  conjective  projectivity  in  which  every  two  homologous  elements  are 
conjugate  is  called  an  involution  or  an  involutional  projectivity.  Every  pair  of 
conjugate  elements  is  called  for  short  an  element  pair  of  the  involution. 
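A small computation makes this plausible. The sketch below assumes that the elements of the common base are described by a coordinate x, so that a projectivity is a map x -> (a*x + b)/(c*x + d); this coordinate description and the helper names `projectivity` and `apply` are assumptions of the example, not notation from the text. We fix the projectivity by A -> B, B -> A, P -> Q and check that Q is sent back to P.

```python
from fractions import Fraction as F

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def projectivity(pairs):
    """Coefficients (a, b, c, d) of x -> (a*x + b)/(c*x + d) through three pairs (x, y)."""
    # each pair gives one linear condition a*x + b - c*x*y - d*y = 0
    rows = [(F(x), F(1), -F(x) * F(y), -F(y)) for x, y in pairs]
    coeffs = []
    for j in range(4):
        minor = [[r[k] for k in range(4) if k != j] for r in rows]
        coeffs.append((-1) ** j * det3(minor))   # null vector via signed minors
    return tuple(coeffs)

def apply(m, x):
    a, b, c, d = m
    return (a * x + b) / (c * x + d)

A, B, P, Q = 0, 1, 2, 5
m = projectivity([(A, B), (B, A), (P, Q)])
print(apply(m, F(Q)))     # 2, i.e. Q is mapped back to P
print(m[0] + m[3])        # 0, the trace of the coefficient matrix vanishes
```

In this coordinate picture the vanishing of a + d is the usual algebraic mark of an involution.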


Since a projectivity is fixed by three elements of one structure and the
homologous elements of the other, an involution is determined by two pairs A, A'
and B, B' of conjugate elements insofar as the elements A, A', B of the one
structure correspond to the elements A', A, B' of the other.

Construction of an involution, i.e., construction of an element P'
corresponding to an arbitrary element P, is most effectively accomplished by
means of Desargues’ involution theorem (where conic sections do not enter into
the picture). Let us say, for example, that we are concerned with the involution
of two ranges of points. Let (A, A') and (B, B') be the given point pairs of the
involution, C an additional given point of the base g, and C' the homolog of C
we are looking for. We draw through A, B, C three lines that form a triangle 1 2 3
(A on 23, B on 31, C on 12), connect A' with 1, B' with 2, and the point of
intersection 4 of these connecting lines with 3. Then 34 meets the base at C'.
(The opposite side pairs 23 and 14, 31 and 24, 12 and 34 of the tetragon 1 2 3 4
cut g at the point pairs (A, A'), (B, B'), and (C, C') of the Desargues involution.)
The construction of the involution between two ray pencils is carried out in a
very similar fashion.
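The tetragon construction can be traced in coordinates. The following sketch assumes that the base g is the x-axis, that points of the base are given by their abscissas, and that the vertex 3 and an auxiliary point U fixing the side 12 are chosen freely; these choices and the function names are assumptions of the example. For comparison, the same conjugate is computed from the symmetric relation a*x*x' + b*(x + x') + c = 0 that two pairs determine.

```python
from fractions import Fraction as F

def cross(u, v):
    """Join of two points or meet of two lines in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def on_base(x):
    return (F(x), F(0), F(1))          # point of the base g (the x-axis)

def conjugate_by_tetragon(A, A1, B, B1, C, P3=(2, 3, 1), U=(0, 2, 1)):
    """C' from the tetragon 1 2 3 4; P3 (vertex 3) and U (on side 12) are arbitrary."""
    a, a1, b, b1, c = map(on_base, (A, A1, B, B1, C))
    side23 = cross(a, P3)              # side 23 passes through A
    side31 = cross(b, P3)              # side 31 passes through B
    side12 = cross(c, U)               # side 12 passes through C
    P2 = cross(side23, side12)
    P1 = cross(side31, side12)
    P4 = cross(cross(a1, P1), cross(b1, P2))      # A'1 meets B'2 at vertex 4
    X, _, Z = cross(cross(P3, P4), (0, 1, 0))     # side 34 meets the base y = 0
    return F(X) / F(Z)

def conjugate_by_formula(A, A1, B, B1, C):
    """C' from the relation a*x*x' + b*(x + x') + c = 0 fixed by the two pairs."""
    a, b, c = cross((F(A) * A1, F(A) + A1, 1), (F(B) * B1, F(B) + B1, 1))
    return -(b * C + c) / (a * C + b)

print(conjugate_by_tetragon(0, 4, 1, 6, 3))   # -2
print(conjugate_by_formula(0, 4, 1, 6, 3))    # -2
```

That the result does not depend on the freely chosen triangle is exactly what Desargues’ theorem guarantees; the sketch merely confirms it for one numerical instance.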


FIG.  73. 

We  will  now  consider  the  important  case  of  the  involution  on  a  circle.  Let  (A, 
A')  and  (B,  B’)  be  two  point  pairs  of  an  involution  between  two  ranges  of  points 
of  a  circle  (Figure  73). 

We connect the points of both sets with the circle points A and A'. We thereby
obtain two projective ray pencils in which the rays AA', AB, AB' of the first
pencil correspond to the rays A'A, A'B', A'B of the second pencil. Since the
junction  line  AA'  of  the  pencil  centers  corresponds  to  itself,  the  pencils  are 
perspective  (No.  59).  The  axis  of  perspectivity  is  the  junction  line  of  the  points 
of  intersection  Z  of  AB  and  A'B'  and  O  of  AB'  and  BA'. 


In order to find the homolog C' in the involution of an arbitrary point C, we
cause AC and OZ to intersect at Y and connect Y with A'; the connecting line
meets the circle again at C'.

Since we can just as well undertake the whole consideration with the pencil
centers B and B' (instead of A and A'), we also obtain C' when we cause BC and
OZ to intersect and connect the point of intersection X with B'.

Since the homologous sides (bearing the same letter designation) of triangles
ABC and A'B'C' intersect on a straight line (XYZ), then, according to Desargues’
homology theorem (No. 59), the junction lines AA', BB', and CC' of the
homologous vertexes pass through one point S. If we then draw through S any
secant, this secant cuts the circle at two conjugate points of the involution.

The  result  of  our  consideration  is  the  theorem: 

The  lines  joining  the  conjugate  points  of  an  involution  on  a  circle  pass 
through  a  fixed  point. 

And  conversely: 

A  secant  rotated  about  a  fixed  point  cuts  a  circle  at  the  point  pairs  of  an 
involution. 
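The reasoning can be followed on a numerical instance. The sketch assumes the unit circle with its rational parametrization (1 - t^2, 2t, 1 + t^2) and uses the parameter t as projective coordinate on the circle; the parameter values and all names are assumptions of the example. The conjugate C' is computed from the two given pairs, and we then check that A', Y, C' are collinear and that CC' passes through the point S in which AA' and BB' meet.

```python
from fractions import Fraction as F

def cross(u, v):
    """Join of two points or meet of two lines in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(a, b, c):
    """Zero exactly when three points are collinear."""
    return sum(x * y for x, y in zip(a, cross(b, c)))

def pt(t):
    """Rational point of the unit circle in homogeneous coordinates."""
    t = F(t)
    return (1 - t * t, 2 * t, 1 + t * t)

tA, tA1, tB, tB1, tC = 0, 2, 1, -3, F(1, 2)

# involution a*t*t' + b*(t + t') + c = 0 fixed by the pairs (A, A') and (B, B')
a, b, c = cross((F(tA) * tA1, F(tA) + tA1, 1), (F(tB) * tB1, F(tB) + tB1, 1))
tC1 = -(b * tC + c) / (a * tC + b)

A, A1, B, B1, C, C1 = map(pt, (tA, tA1, tB, tB1, tC, tC1))
Z = cross(cross(A, B), cross(A1, B1))      # AB meets A'B'
O = cross(cross(A, B1), cross(B, A1))      # AB' meets BA'
Y = cross(cross(A, C), cross(O, Z))        # AC meets the axis OZ
S = cross(cross(A, A1), cross(B, B1))      # AA' meets BB'

print(det3(A1, Y, C1) == 0)   # A'Y passes through C'
print(det3(S, C, C1) == 0)    # CC' passes through S
```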

In  quite  similar  fashion  the  following  theorem  is  proved: 

The  points  of  intersection  of  conjugate  tangents  of  an  involution  on  a  circle 
lie  on  a  straight  line. 

And  conversely: 

If  a  point  moves  on  a  line,  the  tangents  drawn  from  this  point  to  a  circle 
generate  an  involution  on  the  circle  (Figure  74). 


Moreover, since every conic section is the central projection of a circle, and
projectivity, and thus also involution, between two structures is not annulled by
projection of these structures (Pappus’ theorem, No. 59), the two just stated
theorems are valid for conic sections as well:

Involution on a conic section: The lines connecting conjugate points of
an involution on a conic section pass through a fixed point.

The points of intersection of conjugate tangents of an involution on a conic
section lie on a fixed straight line.

And conversely:

A secant rotated about a fixed point cuts a conic section at the point pairs of an
involution. The tangents from a point moving along a fixed straight line to a
conic section are tangent pairs of an involution.


The proof of Desargues’ involution theorem is based on the theorems:

The points of a conic section are projected from pairs of themselves by
projective pencils (No. 61).

The  tangents  of  a  conic  section  cut  two  of  the  tangents  into  projective  ranges 
of  points  (No.  62). 


Let 1 2 3 4 be an inscribed tetragon. Let the line g cut the sides 23, 31, 12 at
A, B, C, the opposite sides 14, 24, 34 at A', B', C', the conic section at S and S'.

We connect the conic section points 2, 3, S, S' with 1 and 4 and obtain two
projective pencils with the centers 1 and 4, so that the rays 12, 13, 1S, 1S' and
42, 43, 4S, 4S' are projective.

FIG. 75.

We cause these pencils to intersect with g and obtain two conjective projective
ranges of points with the base g in which

CBSS' ⊼ B'C'SS',

i.e.,

(CBSS') = (B'C'SS').

We now switch the first two terms with each other and the second two terms
with each other on the right-hand side and obtain

(CBSS') = (C'B'S'S),

so that

CBSS' ⊼ C'B'S'S.

In this projectivity there are two conjugate points S and S'. Consequently, the
projectivity is an involution, and the points B and B', as well as the points C
and C', are conjugate.

If we connect the conic section points 3, 1, S, S' with 2 and 4, and undertake the
same considerations, we find that

(ACSS') = (A'C'S'S),

so that in the involution defined by the point pairs (S, S') and (C, C') the points
A and A' are also conjugate.

Accordingly, (A, A'), (B, B'), (C, C'), and (S, S') are point pairs of an involution.

Let I II III IV be a circumscribed tetragram. Let the lines connecting the point P
with the vertexes II III, III I, I II be a, b, c, with the opposite vertexes I IV, II IV,
III IV a', b', c'. Let the tangents from P to the conic section be t and t'.

We cut the conic section tangents II, III, t, t' with I and IV and obtain two
projective ranges of points on the bases I and IV, so that the points I II, I III,
I t, I t' and IV II, IV III, IV t, IV t' are projective.

FIG. 76.

We project these ranges from P and obtain two conjective projective ray pencils
with the center P in which

cbtt' ⊼ b'c'tt',

i.e.,

(cbtt') = (b'c'tt').

We again switch the first two terms with each other and the second two terms
with each other on the right-hand side and obtain

(cbtt') = (c'b't't),

so that

cbtt' ⊼ c'b't't.

In this projectivity there are two conjugate rays t and t'. Consequently, the
projectivity is an involution, and the rays b and b', as well as the rays c and c',
are conjugate.

If we cut the conic section tangents III, I, t, t' with II and IV, and undertake the
same considerations, we find that

(actt') = (a'c't't),

so that in the involution defined by the ray pairs (t, t') and (c, c') the rays a and
a' are also conjugate.

Accordingly, (a, a'), (b, b'), (c, c'), and (t, t') are ray pairs of an involution.

Thus Desargues’ theorem is proved.
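The point form of the theorem can be tested on a concrete figure. In the sketch below the conic is assumed to be the unit circle, the tetragon 1 2 3 4 and the points S, S' are rational circle points chosen arbitrarily, and points of g are described by their abscissas; the involution is written as a*x*x' + b*(x + x') + c = 0. These choices and the names are assumptions of the example.

```python
from fractions import Fraction as F

def cross(u, v):
    """Join of two points or meet of two lines in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def pt(t):
    """Rational point of the unit circle in homogeneous coordinates."""
    t = F(t)
    return (1 - t * t, 2 * t, 1 + t * t)

P1, P2, P3, P4 = map(pt, (0, 1, -2, 3))    # vertexes of the inscribed tetragon
S, S1 = pt(F(1, 2)), pt(2)                 # the conic points S, S' on g
g = cross(S, S1)                           # here the line y = 4/5

def x_on_g(P, Q):
    """Abscissa of the point in which the side PQ meets g."""
    X, _, Z = cross(cross(P, Q), g)
    return F(X) / F(Z)

pairs = {
    "A": (x_on_g(P2, P3), x_on_g(P1, P4)),   # sides 23 and 14
    "B": (x_on_g(P3, P1), x_on_g(P2, P4)),   # sides 31 and 24
    "C": (x_on_g(P1, P2), x_on_g(P3, P4)),   # sides 12 and 34
    "S": (S[0] / S[2], S1[0] / S1[2]),
}

rows = [(x * x1, x + x1, F(1)) for x, x1 in pairs.values()]
involution = cross(rows[0], rows[1])         # fixed by the pairs A and B
in_involution = all(sum(k * r for k, r in zip(involution, row)) == 0 for row in rows)
print(in_involution)
```

The pairs (C, C') and (S, S') satisfy the relation fixed by (A, A') and (B, B'), as the theorem asserts for this configuration.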


Special  Cases 


We maintain fixed the conic section, the three vertexes 1, 2, 3, and the straight
line g; we allow the vertex 4, on the other hand, to travel on the conic section
toward the point 3. The secant 34 then comes closer and closer to the tangent at
3, while at the same time point A' comes closer and closer to point B and point B'
closer and closer to point A. When 4 reaches 3, 43 becomes a tangent through 3,
and A' coincides with B and B' with A.

We maintain fixed the conic section, the three sides I, II, III, and the point P; we
allow the side IV to roll along the conic section into position III. The vertex
III IV then comes closer and closer to the point of tangency of the tangent III,
while at the same time the ray a' comes closer and closer to the ray b and the ray
b' comes closer and closer to the ray a. When IV coincides with III, IV III
becomes the tangency point of III, and a' coincides with b and b' with a.

Consequently, we obtain


Corollary 1


The points of intersection of a straight line: 1. with a conic section, 2. with two
sides of a triangle inscribed in a conic section, 3. with the third side of the
triangle and the conic section tangent passing through its opposite vertex are
three point pairs of an involution.

1. The tangents from a point to a conic section, 2. the lines joining the point with
two vertexes of a trigram circumscribed about a conic section, 3. the lines
joining the point with the third vertex of the trigram and the point of tangency
on its opposite side are three ray pairs of an involution.

If we maintain fixed the conic section in the figure obtained, the line g, and the
vertexes 1 and 3, and let 2 travel toward 1, then 12 approaches more and more
closely the tangent through 1 and A the point A'. When 2 reaches 1, 12 becomes
the tangent through 1, A coincides with A', and C falls on the tangent through 1.

If we maintain fixed the conic section in the figure obtained, the point P, and the
sides I and III, and let II roll toward I, the point I II approaches more and more
closely the tangency point of I and a the ray a'. When II reaches I, I II becomes
the tangency point of I, a coincides with a', and c passes through the tangency
point of I.


FIG.  80. 


FIG.  79. 


Thus, we have


Corollary 2


Given a conic section with two tangents and their corresponding tangency
chord (Figures 79 and 80):

If the points of intersection of an arbitrary line with the conic section are chosen
as the first pair, the points of intersection with the given tangents as the second
pair of an involution, the point of intersection of the tangency chord with the
line is a double point of the involution.

If the tangents drawn to a conic section from an arbitrary point are chosen as the
first pair, and the rays from the point to the ends of the tangency chord as the
second pair of an involution, the line joining the point with the point of
intersection of the given tangents is a double ray of the involution.


Note.  Through  the  four  corners  of  a  tetragon  there  pass  an  infinite  number 
of  conic  sections,  which  form  a  so-called  conic  section  pencil.  The  (complete) 
tetragon  is  called  a  fundamental  tetragon  in  this  context. 

Similarly,  there  are  an  infinite  number  of  conic  sections  that  are  tangent  to  the 
four  sides  of  a  tetragram;  they  form  a  so-called  field  of  conic  sections.  The 
(complete)  tetragram  in  this  context  is  called  a  fundamental  tetragram. 

Since  Desargues’  theorem  applies  to  every  one  of  these  conic  sections,  we 
can  state  the  theorem  in  the  following  manner,  which  is  its  most  general  and 
shortest  form. 


Desargues’ involution theorem: The intersection point pairs of a line with
the conic sections of a pencil are point pairs of an involution.

The  tangent  pairs  from  a  point  to  the  conic  sections  of  a  field  are  ray  pairs  of 
an  involution. 

Here  the  opposite  side  pairs  of  the  fundamental  tetragon  are  to  be  considered 
as  (degenerate)  conic  sections  of  the  pencil,  and  the  opposite  vertex  pairs  of  the 
fundamental  tetragram  as  (degenerate)  conic  sections  of  the  field. 
A Conic Section from Five Elements


To  draw  a  conic  section  of  which  five  elements— points  and  tangents — are 
known. 


In  the  solution  of  this  fundamental  problem  we  distinguish  three  cases: 

I.  the  five  elements  are  of  the  same  type; 

II.  four  elements  are  of  the  same  type,  but  the  fifth  is  of  the  other; 

III.  three  elements  are  of  one  type,  two  are  of  the  other. 

In the following we will designate the conic section as K.


I. To draw a conic section from five points.

This problem is commonly solved by means of Pascal’s theorem.

We number the points in an arbitrary sequence from 1 to 5 and designate as 6 the
unknown point of intersection of an arbitrary line g = 56, passing through 5, with
K. We then draw the Pascal line p of the hexagon 1 2 3 4 5 6 as the line
connecting the point of intersection of the opposite sides 12 and 45 with the
point of intersection of the opposite sides 23 and 56 = g.

The line joining the point of intersection of the two lines 34 and p with the
vertex 1 cuts g (= 56) at the sought-for point 6.

By repeating the construction with another line g we can obtain as many points
of K as we desire.

In order to draw the tangent to K at one of the five known points 1, 2, 3, 4, 5 of a
conic section, let us say at 5, we make use of the first corollary to Pascal’s
theorem.

We draw the point of intersection of the two sides 51 and 43, also the point of
intersection of the sides 54 and 12, and allow the line p connecting these two
points to intersect with the side 23. The line connecting the resulting point of
intersection with the vertex 5 is the sought-for tangent at 5.

I. To draw a conic section from five tangents.

This problem is commonly solved by means of Brianchon’s theorem.

We number the tangents in an arbitrary sequence from I to V and designate as VI
the unknown tangent drawn to K from an arbitrary point P = V VI of tangent V.
We then draw the Brianchon point B of the hexagram I II III IV V VI as the point
of intersection of the line connecting the opposite vertexes I II and IV V with the
line connecting the opposite vertexes II III and V VI = P.

The point of intersection of the line connecting the two points III IV and B with
the side I is a second point of the sought-for tangent VI.

By repeating the construction with other points P we can obtain as many
tangents of K as we desire.

To draw on one of five known tangents I, II, III, IV, V to a conic section, let us
say on V, the point of tangency with K, we make use of the first corollary to
Brianchon’s theorem.

We draw the line connecting the two vertexes V I and IV III and the line
connecting the two vertexes V IV and I II, and connect the point of intersection
B of the two lines with the vertex II III. This new junction line meets the tangent
V at the sought-for point of tangency.

II. To draw a conic section of which four points 1, 2, 3, 4 and one tangent t are
given.

First case: The tangent t passes through one of the given points, for example,
through 4.

Let us consider the tangent t as the line connecting two infinitely close conic
section points 4 and 5, so that t = 45, and let us designate as 6 the point of
intersection of K with an arbitrary line x starting from 1, so that x = 16. We then
draw the Pascal line p of the hexagon 1 2 3 4 5 6 as the line connecting the point
of intersection of opposite sides 12 and 45 = t with the point of intersection of
the opposite sides 34 and 61 = x. The line connecting the point of intersection of
the lines p and 23 with the vertex 4 meets x at the sought-for point 6.

We now have five known points of K, and the problem is reduced to I.

Second case: The tangent t does not pass through any of the given points.

To solve this problem we use Desargues’ involution theorem (No. 63), taking t as
the involution base. We determine the points of intersection, let us say A, A', B,
B', of the sides 12, 34, 23, 41 of the tetragon 1 2 3 4 with t and draw a double
point of the involution determined on t by the two point pairs (A, A') and
(B, B'); this is the point of tangency of the tangent t.

Now five points of K are known and the problem is reduced to I.

II. To draw a conic section of which four tangents I, II, III, IV and one point P are
given.

First case: The point P lies on one of the given tangents, for example, on IV.

Let us consider the point P as the point of intersection of two infinitely close
conic section tangents IV and V, so that P = IV V, and let us designate as VI a
second tangent from an arbitrary point X of I to K, so that X = I VI. We then draw
the Brianchon point B of the hexagram I II III IV V VI as the point of intersection
of the line connecting the opposite vertexes I II and IV V = P and the line
connecting the opposite vertexes III IV and VI I = X. The point of intersection of
the line connecting the points B and II III with the side IV is a second point of
the sought-for tangent VI.

We now have five known tangents of K and the problem is thereby reduced to I.

Second case: The point P does not lie on any of the given tangents.

To solve this problem we make use of Desargues’ involution theorem (No. 63),
taking P as the involution center. We determine the junction lines a, a', b, b'
connecting the vertexes I II, III IV, II III, IV I of the tetragram I II III IV with P
and construct the double ray of the involution determined on P by the two ray
pairs (a, a') and (b, b'); this is the conic section tangent passing through P.

We now have five known tangents of K and the problem thus reduces to I.


The second case of II. has two solutions if the involution has two double
elements and no solution if the involution has no double elements.
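The Pascal construction of the sixth point in I. lends itself to a short computational sketch. It assumes, purely for testing, that the five given points are rational points of the unit circle, so that the result can be checked against the circle equation; the line g through 5 and all names are assumptions of the example.

```python
from fractions import Fraction as F

def cross(u, v):
    """Join of two points or meet of two lines in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def pt(t):
    """Rational point of the unit circle in homogeneous coordinates."""
    t = F(t)
    return (1 - t * t, 2 * t, 1 + t * t)

def sixth_point(P1, P2, P3, P4, P5, g):
    """Second intersection of g (a line through P5) with the conic through P1..P5."""
    first = cross(cross(P1, P2), cross(P4, P5))    # 12 meets 45
    second = cross(cross(P2, P3), g)               # 23 meets 56 = g
    p = cross(first, second)                       # Pascal line
    Q = cross(cross(P3, P4), p)                    # 34 meets p
    X, Y, Z = cross(cross(Q, P1), g)               # the line 16 meets g at 6
    return (F(X) / F(Z), F(Y) / F(Z))

P1, P2, P3, P4, P5 = map(pt, (0, 1, -2, 3, F(1, 2)))
g = cross(P5, (2, 0, 1))            # arbitrary line through 5, here via the point (2, 0)
x6, y6 = sixth_point(P1, P2, P3, P4, P5, g)
print(x6, y6)                       # 5/13 12/13
print(x6 * x6 + y6 * y6 == 1)       # True: the constructed point lies on the circle
```

Only joins and meets of lines are used, which mirrors the fact that the construction needs a straightedge alone.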


111.  To  draw  a  conic  section  of 
which  three  points  A,  B,  C  and 
two  tangents  d  and  e  are  given. 

First  case:  d  passes  through 
A,  and  e  through  B. 

We  draw  the  point  of 
intersection  S  of  an  arbitrary  line 
g  originating  at  A  with  ft. 

For  our  purpose  we  construct 
the  Pascal  line  p  of  the  hexagon 
1  2  3  4  5  6  of  which  the  vertexes 
1  and  2  coincide  with  A,  the 
vertexes  3  and  4  with  B ,  the 
vertex  5  with  C,  and  the  vertex  6 
with  S,  the  sides  12  and  34  being 
represented  by  the  tangents  d 
and  e,  respectively,  p  is  the  line 
connecting  the  point  of 
intersection  of  the  sides  12  =  d 
and  45  =  BC  with  the  point  of 
intersection  of  the  sides  34  =  e 
and  61  =  g.  The  line  connecting 
the  point  of  intersection  of  the 
lines  p  and  23  =  AB  with  the 
vertex  5  =  C  meets  g  at  the 
sought-for  conic  section  point  S. 

In  the  same  way  we  draw  a 
fifth  point  of  ft  and  thus  reduce 
the  problem  to  I. 

Second  case:  d  passes 
through  A,  and  e  does  not  pass 
through  any  of  the  given  points. 


III.  To  draw  a  conic  section  of 
which  three  tangents  a,  b,  c  and 
two  points  D  and  E  are  given. 

First  case:  D  lies  on  a,  and  E 
on  b. 

We  draw  the  (second)  tangent 
t  from  an  arbitrary  point  P  of 
tangent  a  to  ft. 

For  our  purpose  we  construct 
the  Brianchon  point  B  of  the 
hexagram  I  II  III  IV  V  VI  of 
which  the  sides  I  and  II  coincide 
with  a,  the  sides  III  and  IV  with 
b,  the  side  V  with  c,  and  the  side 
VI  with  t,  the  vertexes  I II  and  III 
IV  being  represented  by  the 
points  D  and  E,  respectively.  B  is 
the  point  of  intersection  of  the 
line  connecting  the  vertexes  I II  = 
D  and  IV  V  =  be  and  the  line 
connecting  the  vertexes  III  IV  = 
E  and  VI  I  =  P.  The  point  of 
intersection  of  the  line 
connecting  points  B  and  II  III  = 
ab  with  the  side  V  =  c  is  a  second 
point  of  the  sought-for  tangent  t. 

In  the  same  way  we  draw  a 
fifth  tangent  of  ft  and  thereby 
reduce  the  problem  to  I. 

Second  case:  D  lies  on  a, 
and  E  does  not  lie  on  any  of  the 
given  tangents. 


We  solve  this  case  with  the  second  corollary  to  Desargues’  involution 
theorem. 


We  determine  the  points  of 


We  determine  the  connecting 


intersection  D  and  E  of  the  line 
BC  with  d  and  e  and  construct  a 
double  point  of  the  involution 
defined  by  the  point  pairs  ( B ,  C) 
and  ( D ,  E).  Its  junction  line  with 
A  passes  through  the  point  of 
tangency  of  e. 


lines  d  and  e  joining  the  point  be 
with  D  and  E  and  draw  a  double 
ray  of  the  involution  determined 
by  the  ray  pairs  (b,  c )  and  (d,  e). 
Its  point  of  intersection  with  a 
lies  on  the  tangent  passing 
through  E\  this  tangent  is  thus 
determined. 


The  problem  is  now  reduced  to  the  preceding  case. 


Third  case:  Neither  of  the 
two  tangents  passes  through  any 
of  the  given  points. 


Third  case:  Neither  of  the 
two  points  lies  on  any  of  the 
given  tangents. 


In  this  case  also  the  solution  is  based  on  the  second  corollary  to  Desargues’ 
involution  theorem. 


We  designate  the  points  of 
intersection  of  BC  with  d  and  e 
as  D  and  E  and  determine  a 
double  point  P  of  the  involution 
defined  by  the  point  pairs  ( B ,  C) 
and  ( D ,  E).  It  lies  on  the 
tangency  chord  of  the  tangents  d 
and  e. 

We  designate  the  points  of 
intersection  of  CA  with  d  and  e 
as  D'  and  E'  and  draw  a  double 
point  F  of  the  involution 
determined  by  the  point  pairs  (C, 
A)  and  ( D ,  E').  This  double  point 
also  lies  on  the  tangency  chord 
of  the  tangents  d  and  e. 

The  line  joining  the  two 
double  points  P  and  P’  is  thus 
the  tangency  chord  we  have 
mentioned  and  meets  the 
tangents  d  and  e  at  their 
tangency  points. 


We designate the lines joining bc with D and E as d and e and determine a double ray s of the
involution determined by the ray pairs (b, c) and (d, e). It passes through the point of
intersection of the tangents drawn through D and E.

We designate the lines joining ca with D and E as d' and e' and draw a double ray s' of the
involution determined by the ray pairs (c, a) and (d', e'). This double ray also passes
through the point of intersection of the tangents through D and E.

The point of intersection of the two double rays s and s' is thus the tangent intersection
point that was mentioned before and the lines joining it to D and E are the tangents passing
through D and E.

We now know five points of the conic section and thus return to I.

We now have five tangents of the conic section and thus return to I.

This  last  problem  admits  of  a  solution  only  when  each  of  the  two  designated 
involutions  has  double  elements.  And  since  we  can  connect  each  of  the  two 
double  elements  of  one  of  the  involutions  with  each  of  the  double  elements  of 
the  other,  we  obtain  four  possible  tangency  chords  and  tangent  intersection 
points,  respectively,  and  thus  four  different  conic  sections. 


65 


A  Conic  Section  and  a  Straight  Line 


To  draw  the  points  of  intersection  of  a  given  straight  line  with  a  conic  section 
of  which  five  elements — points  and  tangents — are  known. 


In  the  solution  of  this  problem  we  may  assume,  in  view  of  No.  64,  that  five 
points  of  the  conic  section  are  known.  The  solution  is  then  based  on  the  theorem: 
The  points  of  a  conic  section  are  projected  from  pairs  of  themselves  by  projective 
pencils  (No.  61)  and  on  Steiner’s  double  element  construction  (No.  60). 

Let the given line be called 0, the given points of the conic section A, B, C, D, E. We can
think of the points of the conic section as projected from D and E by the two projective
pencils I and II. These pencils cut 0 in the two projective ranges of points 1 and 2. The
points of intersection S and T of 0 with the conic section are the double elements of the
projectivity 1 2. This projectivity is, however, determined by the points of intersection
$A_1$, $B_1$, $C_1$ of the rays DA, DB, DC with 0 and the homologous points of intersection
$A_2$, $B_2$, $C_2$ of the rays EA, EB, EC with 0.

We therefore draw according to Steiner the double elements of the projectivity defined on 0
by the homologous point triplets ($A_1$, $B_1$, $C_1$) and ($A_2$, $B_2$, $C_2$); they are
the points of intersection we are looking for.
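
The double elements can also be found by calculation rather than by drawing. The following is
only a minimal sketch of that algebraic counterpart, not Steiner's construction itself. It
assumes that every point of the line is described by a single coordinate x, and the function
name `double_elements` is a hypothetical one chosen for illustration. Three pairs of
homologous points fix a relation A·x·x' + B·x + C·x' + D = 0, and the double points are the
roots of the quadratic obtained by setting x' = x.

```python
from fractions import Fraction
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def double_elements(pairs):
    """Real double points of the projectivity fixed by three pairs (x, x')."""
    rows = [[Fraction(x) * Fraction(y), Fraction(x), Fraction(y), Fraction(1)]
            for x, y in pairs]

    def minor(skip):
        # 3x3 determinant left after deleting column `skip`
        return det3([[row[j] for j in range(4) if j != skip] for row in rows])

    # (A, B, C, D) is orthogonal to all three rows (generalized cross product)
    A, B, C, D = minor(0), -minor(1), minor(2), -minor(3)
    a, b, c = A, B + C, D              # x' = x gives a*x^2 + b*x + c = 0
    if a == 0:
        return [] if b == 0 else [float(-c / b)]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                      # no real double points
    root = math.sqrt(disc)
    return sorted([(float(-b) - root) / float(2 * a),
                   (float(-b) + root) / float(2 * a)])


# Example: the pairs (2, 1/2), (3, 1/3), (4, 1/4) belong to x' = 1/x
print(double_elements([(2, Fraction(1, 2)), (3, Fraction(1, 3)), (4, Fraction(1, 4))]))
```

An empty result corresponds to a line that does not meet the conic section in real points.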


A Conic Section and a Point


To  draw  the  tangents  from  a  given  point  to  a  conic  section  of  which  five 
elements — points  and  tangents — are  known. 


In view of the considerations of No. 64, we may assume the given conic section elements to be
tangents.

The solution to this problem is based upon the theorem: The tangents of a conic section mark
off projective ranges of points on two of the tangents (No. 62) and on Steiner's double
element construction (No. 60).

Let the given point be P, the given tangents a, b, c, d, e. Let us consider the tangents of
the conic section as intersecting with d and e, so that we obtain on d and e the projective
ranges 1 and 2 in which the points of intersection $A_1$, $B_1$, $C_1$ of the tangents a, b, c
with d and the points of intersection $A_2$, $B_2$, $C_2$ of the tangents a, b, c with e are
homologous elements. The projections of these ranges of points from P thus form two projective
ray pencils I and II. The (conjective) projectivity is determined by the lines $a_I$, $b_I$,
$c_I$ connecting the points of intersection $A_1$, $B_1$, $C_1$ to P and the homologous
connecting lines $a_{II}$, $b_{II}$, $c_{II}$ joining the points of intersection $A_2$, $B_2$,
$C_2$ to P. Since each of the two tangents s and t from P to the conic section cuts 1 and 2 in
homologous elements, s and t are therefore the double elements of the projectivity I II.

We thus draw according to Steiner the double elements of the conjective projectivity
determined by the homologous ray triplets ($a_I$, $b_I$, $c_I$) and ($a_{II}$, $b_{II}$,
$c_{II}$); they are the sought-for tangents.


*  If  a  circular  disc  rolls  along  the  circumference  of  a  fixed  circle  (without  sliding),  a  marked  point  on  the 
circumference  of  the  rolling  disc  (the  “rolling  circle”)  describes  an  epicycloid  when  the  disc  rolls  along  the 
outside  of  the  fixed  circle  and  a  hypocycloid  when  the  disc  rolls  along  the  inside. 

* From the triangle with sides n, r and the line w joining the end points of n and r lying on
the x-axis, we obtain $\cos\psi = (n^2 + r^2 - w^2)/2nr$. If we express the numerator of this
fraction entirely in terms of x, thus expressing $n^2$ by $y^2 + u^2 = 2px - qx^2 + (p - qx)^2$,
r by $\varepsilon x + k$, and w by $(x - k) + u = \varepsilon^2 x + k\varepsilon$, and combine,
the numerator then becomes equal to $2p(\varepsilon x + k) = 2pr$ and $\cos\psi$ becomes
$2pr/2nr = p/n$.

*  A  complete  tetragon  (tetragram)  consists  essentially  of  four  points  (lines)  1,  2,  3,  4  and  their  six 
connecting lines (points of intersection) 23, 14, 31, 24, 12, 34, of which 23 and 14, 31 and 24, 12 and 34 are
known  as  opposite  sides  (opposite  vertexes). 


Stereometric  Problems 


Steiner's Division of Space by Planes


What  is  the  maximum  number  of  parts  into  which  a  space  can  be  divided  by  n 
planes? 

This very interesting problem appears in Steiner's paper "Several laws governing the division
of planes and space" (Crelle's Journal, vol. 1, and Steiner's Complete Works, vol. I).

We  first  solve  the  preliminary  problem:  What  is  the  maximum  number  of 
parts  into  which  a  plane  can  be  divided  by  n  straight  lines? 

The number of parts will evidently be maximal when no two lines are parallel and no more than
two lines pass through one point. In the following we will assume these two conditions to be
satisfied and we will designate the corresponding number of surface sections generated by the
n lines as $\bar{n}$.

Thus, let the plane be divided by n lines into $\bar{n}$ surface sections. We now draw one
additional line. This line is cut by the first n lines in n points and is thereby divided into
n + 1 pieces; it thus traverses n + 1 of the available $\bar{n}$ surface sections, dividing
each of them into two parts, so that the (n + 1)th line increases the number of surface
sections by n + 1. Consequently, we obtain the equation

$$\overline{n+1} = \bar{n} + (n + 1).$$

We then apply this equation to the cases 0, 1, 2, ..., n - 1 and we form the n equations

$$\bar{1} = \bar{0} + 1, \quad \bar{2} = \bar{1} + 2, \quad \bar{3} = \bar{2} + 3, \quad \ldots, \quad \bar{n} = \overline{n-1} + n.$$

Addition of these equations results in

$$\bar{n} = \bar{0} + (1 + 2 + 3 + \cdots + n) = 1 + (1 + 2 + 3 + \cdots + n)$$

or, since the sum of the first n natural numbers is n(n + 1)/2,

(1)  $$\bar{n} = \frac{n^2 + n + 2}{2}.$$

Thus, the maximum number of parts into which a plane can be divided by n lines is
$(n^2 + n + 2)/2$.

The  obtained  result  is  easily  confirmed  for  the  cases  n  =  1,2,3,.... 
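
The confirmation for small n can be carried out mechanically. The following is only a short
sketch with hypothetical function names: it builds the number of parts step by step, using the
fact that the (n + 1)th line adds n + 1 parts, and compares the result with formula (1).

```python
def plane_parts_stepwise(n):
    """Count the parts by adding the lines one at a time."""
    parts = 1                      # no line: the plane is one piece
    for k in range(1, n + 1):
        parts += k                 # the k-th line crosses k existing parts
    return parts


def plane_parts_closed(n):
    """Formula (1): (n^2 + n + 2) / 2."""
    return (n * n + n + 2) // 2


for n in range(6):
    print(n, plane_parts_stepwise(n), plane_parts_closed(n))
```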

Now for the space problem! It is apparent that the number of partial spaces attains a maximum
when no more than three planes ever intersect at one point and when the lines of intersection
of no more than two planes are ever parallel. We will therefore assume that these conditions
are satisfied in the following and we designate the number of partial spaces formed by n
planes as $\hat{n}$.

Then, let the space be divided by n planes into $\hat{n}$ partial spaces. To these planes we
now add one additional plane. This plane is cut by the original n planes in n lines of which
no more than two pass through a single point and no two are parallel. The new (n + 1)th plane
is therefore divided by the n lines into $\bar{n} = (n^2 + n + 2)/2$ surface sections, the
number found in (1).

Each of these $\bar{n}$ surface sections cuts the partial space that it traverses into two
smaller spaces, so that the addition of the (n + 1)th plane increases the number of the
partial spaces originally present by $\bar{n}$. This gives us the equation

$$\widehat{n+1} = \hat{n} + \bar{n}.$$

We form this equation for the cases 0, 1, 2, ..., n - 1 and obtain the n equations

$$\hat{1} = \hat{0} + \bar{0}, \quad \hat{2} = \hat{1} + \bar{1}, \quad \hat{3} = \hat{2} + \bar{2}, \quad \ldots, \quad \hat{n} = \widehat{n-1} + \overline{n-1}.$$

Addition of these equations results in

$$\hat{n} = 2 + \bar{1} + \bar{2} + \bar{3} + \cdots + \overline{n-1}$$

or, according to (1),

$$\hat{n} = n + 1 + \tfrac{1}{2}\bigl(1 \cdot 2 + 2 \cdot 3 + \cdots + (n-1)n\bigr).$$

If we then divide each product $\nu(\nu + 1)$ into $\nu^2 + \nu$, we obtain

$$\hat{n} = n + 1 + \tfrac{1}{2}\bigl\{[1^2 + 2^2 + \cdots + (n-1)^2] + [1 + 2 + \cdots + (n-1)]\bigr\}.$$

Now, according to No. 11, the sums in the first and second square brackets, respectively, are

$$\tfrac{1}{6}(n-1)n(2n-1) \quad \text{and} \quad \tfrac{1}{2}(n-1)n;$$

the brace thus equals $\tfrac{1}{3}(n-1)n(n+1)$, and

$$\hat{n} = n + 1 + \tfrac{1}{6}(n-1)n(n+1)$$

or

$$\hat{n} = \frac{n^3 + 5n + 6}{6}.$$

Conclusion: The maximum number of parts into which a space can be divided by n planes is
$(n^3 + 5n + 6)/6$.
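
The result can be checked numerically in the same way as the plane formula. This is a minimal
sketch with hypothetical function names: each new plane adds as many partial spaces as there
are parts in a plane cut by the lines in which the earlier planes meet it.

```python
def space_parts_stepwise(n):
    """Count the partial spaces by adding the planes one at a time."""
    plane_parts = 1                # parts of a plane cut by 0 lines
    space_parts = 1                # no plane: space is one piece
    for k in range(n):
        # the new plane is cut by the k earlier planes in k lines
        space_parts += plane_parts
        plane_parts += k + 1       # one more line on the next plane
    return space_parts


def space_parts_closed(n):
    """Closed form (n^3 + 5n + 6) / 6."""
    return (n ** 3 + 5 * n + 6) // 6


print([space_parts_closed(n) for n in range(6)])
```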


Euler's Tetrahedron Problem


To express the volume of a tetrahedron in terms of its six edges.

This  fundamental  problem  was  posed  and  solved  by  Leonhard  Euler  (Novi 
Commentarii  Academiae  Petropolitanae  ad  annos  1752  et  1753). 

The  following  convenient  and  simple  solution  is  based  upon  vector  calculus. 

We will designate the vertexes of the tetrahedron as A, B, C, O, the six edges BC, CA, AB, OA,
OB, OC as a, b, c, p, q, r, the three vectors $\overrightarrow{OA}$, $\overrightarrow{OB}$,
$\overrightarrow{OC}$ as $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$, and the volume we are
looking for as T. We will consider the edges p, q, r originating from the vertex O as being so
arranged that they form a right-handed system, i.e., that p can be imagined as the thumb, q as
the index finger, and r as the middle finger of the right hand.

If we take the triangle OAB as the base surface and the vertex C as the apex of the
tetrahedron, then the double value S of the base surface area is given by the magnitude of the
vector product $\mathbf{S} = \mathbf{p} \times \mathbf{q}$, and the altitude CF is the
projection of the edge r on CF, i.e., ro, if we designate as o the cosine of the angle between
CO and CF or also of the angle between the two vectors $\mathbf{S}$ and $\mathbf{r}$.

Consequently, six times the tetrahedron volume is equal to S · ro or equal to the scalar
product* $\mathbf{S} \cdot \mathbf{r}$ of the vectors $\mathbf{S}$ and $\mathbf{r}$. Thus, we
obtain the simple formula

$$6T = (\mathbf{p} \times \mathbf{q}) \cdot \mathbf{r},$$

which can be stated verbally as follows:

Six times the volume of a tetrahedron is equal to the mixed product of the three vectorial
edges originating from one vertex of the tetrahedron.

Here the three factors of the mixed product must be written in such sequence as to form a
right-handed system (for otherwise the mixed product would represent six times the negative
tetrahedron volume).


We now introduce a rectangular coordinate system with origin at O and designate the
coordinates of the three vertexes A, B, C as x|y|z, x'|y'|z', and x''|y''|z''. The three
components of the vector $\mathbf{S} = \mathbf{p} \times \mathbf{q}$ are then yz' - zy',
zx' - xz', xy' - yx', and the scalar product $\mathbf{S} \cdot \mathbf{r}$ is equal to
(yz' - zy')x'' + (zx' - xz')y'' + (xy' - yx')z'', i.e., equal to the determinant whose rows
are the components of the vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$. Thus we obtain the
elegant formula

$$6T = \begin{vmatrix} x & y & z \\ x' & y' & z' \\ x'' & y'' & z'' \end{vmatrix}.$$


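On squaring this formula, multiplying the two (identical) determinants row by row, we obtain

$$36T^2 = \Delta = \begin{vmatrix}
xx + yy + zz & xx' + yy' + zz' & xx'' + yy'' + zz'' \\
x'x + y'y + z'z & x'x' + y'y' + z'z' & x'x'' + y'y'' + z'z'' \\
x''x + y''y + z''z & x''x' + y''y' + z''z' & x''x'' + y''y'' + z''z''
\end{vmatrix}$$

or, since the elements of this determinant are the scalar products of the vectors
$\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ in pairs, or the squares of these vectors,

(I)  $$36T^2 = \begin{vmatrix}
\mathbf{p}\cdot\mathbf{p} & \mathbf{p}\cdot\mathbf{q} & \mathbf{p}\cdot\mathbf{r} \\
\mathbf{q}\cdot\mathbf{p} & \mathbf{q}\cdot\mathbf{q} & \mathbf{q}\cdot\mathbf{r} \\
\mathbf{r}\cdot\mathbf{p} & \mathbf{r}\cdot\mathbf{q} & \mathbf{r}\cdot\mathbf{r}
\end{vmatrix}.$$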

This is Euler's tetrahedron formula. (Euler, however, expressed the right-hand side as an
algebraic sum rather than as a determinant.)

It contains the solution to the problem posed, since the elements of the determinant are
simple expressions of the edges; specifically:

$$\mathbf{p}\cdot\mathbf{p} = p^2, \quad \mathbf{q}\cdot\mathbf{q} = q^2, \quad \mathbf{r}\cdot\mathbf{r} = r^2,$$

$$\mathbf{q}\cdot\mathbf{r} = \frac{q^2 + r^2 - a^2}{2}, \quad \mathbf{r}\cdot\mathbf{p} = \frac{r^2 + p^2 - b^2}{2}, \quad \mathbf{p}\cdot\mathbf{q} = \frac{p^2 + q^2 - c^2}{2}.$$

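In the tetrahedron with the edges a = 11, b = 10, c = 9, p = 8, q = 7, r = 6, for example, we
have

$$\mathbf{p}\cdot\mathbf{p} = 64, \quad \mathbf{q}\cdot\mathbf{q} = 49, \quad \mathbf{r}\cdot\mathbf{r} = 36, \quad \mathbf{q}\cdot\mathbf{r} = -18, \quad \mathbf{r}\cdot\mathbf{p} = 0, \quad \mathbf{p}\cdot\mathbf{q} = 16,$$

and

$$36T^2 = \begin{vmatrix} 64 & 16 & 0 \\ 16 & 49 & -18 \\ 0 & -18 & 36 \end{vmatrix}
= 16 \cdot 36 \begin{vmatrix} 4 & 16 & 0 \\ 1 & 49 & -9 \\ 0 & -1 & 1 \end{vmatrix}
= 16 \cdot 36 \cdot 9 \cdot 16$$

and T = 48.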
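
The numerical example can be reproduced with a few lines of code. The following is only an
illustrative sketch; the function name `volume_squared` is hypothetical. It forms the scalar
products from the six edges as stated above and expands the three-row determinant of formula
(I) directly.

```python
from fractions import Fraction


def volume_squared(a, b, c, p, q, r):
    """T^2 from formula (I); a, b, c, p, q, r are the edges BC, CA, AB, OA, OB, OC."""
    pp, qq, rr = p * p, q * q, r * r
    qr = Fraction(qq + rr - a * a, 2)
    rp = Fraction(rr + pp - b * b, 2)
    pq = Fraction(pp + qq - c * c, 2)
    # determinant of the symmetric matrix [[pp, pq, rp], [pq, qq, qr], [rp, qr, rr]]
    det = (pp * (qq * rr - qr * qr)
           - pq * (pq * rr - qr * rp)
           + rp * (pq * qr - qq * rp))
    return det / 36


print(volume_squared(11, 10, 9, 8, 7, 6))   # 2304, i.e. T = 48
```

A negative or zero result would indicate that no proper tetrahedron with the given edges
exists.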

We  can  put  the  obtained  result  into  still  another  form. 

If we multiply each element of the determinant Δ of (I) by 2 and express the doubled scalar
products by the squares P, Q, R, A, B, C of the edge magnitudes p, q, r, a, b, c, we obtain

$$288T^2 = \begin{vmatrix}
2P & P + Q - C & P + R - B \\
Q + P - C & 2Q & Q + R - A \\
R + P - B & R + Q - A & 2R
\end{vmatrix}.$$


Now we distribute zeros at the left and minus ones at the bottom and obtain

$$288T^2 = \begin{vmatrix}
0 & 2P & P + Q - C & P + R - B \\
0 & Q + P - C & 2Q & Q + R - A \\
0 & R + P - B & R + Q - A & 2R \\
-1 & -1 & -1 & -1
\end{vmatrix}.$$

If we add the P-, Q-, and R-multiples of the last row to the first, second, and third rows,
respectively, we obtain the somewhat simpler

$$288T^2 = \begin{vmatrix}
-P & P & Q - C & R - B \\
-Q & P - C & Q & R - A \\
-R & P - B & Q - A & R \\
-1 & -1 & -1 & -1
\end{vmatrix}.$$

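We now distribute zeros and ones at the top and right:

$$288T^2 = \begin{vmatrix}
0 & 0 & 0 & 0 & 1 \\
-P & P & Q - C & R - B & 1 \\
-Q & P - C & Q & R - A & 1 \\
-R & P - B & Q - A & R & 1 \\
-1 & -1 & -1 & -1 & 0
\end{vmatrix}.$$

If we now subtract the P-, Q-, and R-multiples of the last column from the second, third, and
fourth columns, respectively, we finally obtain

$$288T^2 = \begin{vmatrix}
0 & -P & -Q & -R & 1 \\
-P & 0 & -C & -B & 1 \\
-Q & -C & 0 & -A & 1 \\
-R & -B & -A & 0 & 1 \\
-1 & -1 & -1 & -1 & 0
\end{vmatrix}$$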


or, if we reverse all the minus signs,

(II)  $$288T^2 = \begin{vmatrix}
0 & P & Q & R & 1 \\
P & 0 & C & B & 1 \\
Q & C & 0 & A & 1 \\
R & B & A & 0 & 1 \\
1 & 1 & 1 & 1 & 0
\end{vmatrix}.$$

In this remarkable formula P, Q, R, A, B, C are the squares of the edges p, q, r, a, b, c.
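
Formula (II) lends itself to direct evaluation. The sketch below is illustrative only (the
helper names `det` and `volume_squared_times_288` are hypothetical); it expands the five-row
determinant along its first row, which is adequate for a matrix of this size, and reproduces
the value belonging to the example T = 48 of formula (I).

```python
def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def volume_squared_times_288(a, b, c, p, q, r):
    """Right-hand side of formula (II) for the edges a, b, c, p, q, r."""
    A, B, C, P, Q, R = (edge * edge for edge in (a, b, c, p, q, r))
    return det([
        [0, P, Q, R, 1],
        [P, 0, C, B, 1],
        [Q, C, 0, A, 1],
        [R, B, A, 0, 1],
        [1, 1, 1, 1, 0],
    ])


print(volume_squared_times_288(11, 10, 9, 8, 7, 6), 288 * 48 ** 2)
```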

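Note: the four-point relation: If A, B, C, O are four points of a plane, the volume of the
tetrahedron ABCO is zero and (I) is transformed into the so-called four-point relation:

$$\begin{vmatrix}
\mathbf{p}\cdot\mathbf{p} & \mathbf{p}\cdot\mathbf{q} & \mathbf{p}\cdot\mathbf{r} \\
\mathbf{q}\cdot\mathbf{p} & \mathbf{q}\cdot\mathbf{q} & \mathbf{q}\cdot\mathbf{r} \\
\mathbf{r}\cdot\mathbf{p} & \mathbf{r}\cdot\mathbf{q} & \mathbf{r}\cdot\mathbf{r}
\end{vmatrix} = 0$$

for the six junction lines BC = a, CA = b, AB = c, OA = p, OB = q, OC = r that are possible
between the four points.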


The Shortest Distance Between Skew Lines


To  calculate  the  angle  and  distance  between  two  given  skew  lines. 


This  important  problem  is  usually  encountered  in  one  of  the  following  two 
forms: 

I. To calculate the angle and distance between two skew lines when a point on each line and
the direction of each line are given, the former by coordinates and the latter by the
direction cosines of the line.

II.  To  calculate  the  angle  and  distance  between  two  opposite  edges  of  a 
tetrahedron  whose  six  edges  are  known. 

The  distance  between  two  skew  lines  is  naturally  the  shortest  distance 
between  the  lines,  i.e.,  the  length  of  the  line  perpendicular  to  both  lines  and 
joining  a  point  on  each. 

Solution of I. We designate the rectangular coordinates of the two given points P and p as
A|B|C and a|b|c, the vector $\overrightarrow{pP}$ (with the components A - a, B - b, C - c) as
$\mathbf{d}$, the direction cosines of the two lines, which are at the same time the
components of two unit vectors $\mathbf{E}$ and $\mathbf{e}$ lying on the lines, as L, M, N
and l, m, n, the sought-for angle of the two lines as ω, and the sought-for minimum distance
as k.

The solution to this problem, which is in itself not very simple, becomes astonishingly simple
with the introduction of the scalar product $\mathbf{E} \cdot \mathbf{e}$ and the vector
product $\mathbf{E} \times \mathbf{e}$ of the two vectors $\mathbf{E}$ and $\mathbf{e}$.

The former can be expressed on the one hand (since the vectors $\mathbf{E}$ and $\mathbf{e}$
have a magnitude of 1) as cos ω, and, on the other, by the components of the factors as
Ll + Mm + Nn. We therefore obtain

(1)  $$\cos\omega = Ll + Mm + Nn.$$

The latter is perpendicular to both lines, so that the projection of $\mathbf{d}$ on the
vector $\mathbf{E} \times \mathbf{e}$ represents the desired distance k (the shortest distance
k between the two lines is specifically the projection of $\mathbf{d}$ on k and at the same
time the projection of $\mathbf{d}$ on every parallel to k, for example, on
$\mathbf{E} \times \mathbf{e}$). However, since the projection of a vector $\mathbf{u}$ on a
second vector $\mathbf{v}$ of the magnitude v is $\mathbf{u} \cdot \mathbf{v}/v$, we obtain
for k the value $\mathbf{d} \cdot (\mathbf{E} \times \mathbf{e})/\sin\omega$ (sin ω is the
magnitude of the vector $\mathbf{E} \times \mathbf{e}$).

Now the scalar product of the two vectors $\mathbf{d}$ and $\mathbf{E} \times \mathbf{e}$ is
nothing other than the so-called mixed product of the three vectors $\mathbf{d}$,
$\mathbf{E}$, and $\mathbf{e}$. And since the latter is equal to the determinant whose rows
are the components of the three vectors (No. 68), we obtain the formula

(2)  $$k = \begin{vmatrix} A - a & B - b & C - c \\ L & M & N \\ l & m & n \end{vmatrix} \Big/ \sin\omega.$$
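
Formulas (1) and (2) translate directly into a short computation. The following is a minimal
sketch under two assumptions that the text does not state in this form: the two directions are
supplied as unit vectors, and the lines are not parallel (otherwise sin ω vanishes). The
function names are hypothetical, and the absolute value is taken because the sign of the mixed
product depends only on the orientation chosen for the vectors.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def skew_angle_and_distance(P, p, E, e):
    """Angle (radians) and distance of the lines through P, p with unit directions E, e."""
    d = tuple(Pi - pi for Pi, pi in zip(P, p))      # vector from p to P
    cos_w = max(-1.0, min(1.0, dot(E, e)))          # formula (1)
    n = cross(E, e)
    sin_w = math.sqrt(dot(n, n))                    # magnitude of E x e
    k = abs(dot(d, n)) / sin_w                      # formula (2)
    return math.acos(cos_w), k


# Example: the x-axis and a line parallel to the y-axis at height 1
angle, distance = skew_angle_and_distance((0, 0, 1), (0, 0, 0), (0, 1, 0), (1, 0, 0))
print(math.degrees(angle), distance)
```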


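Note. If we desire to calculate the coordinates X|Y|Z and x|y|z of the end points U and u of
the shortest junction line k, we designate the segments PU and pu as R and r, the vector
$\overrightarrow{uU}$ as $\mathbf{k}$, and we then have

$$\overrightarrow{uU} = \overrightarrow{up} + \overrightarrow{pP} + \overrightarrow{PU},$$

or

$$\mathbf{k} = -r\mathbf{e} + \mathbf{d} + R\mathbf{E}$$

($\mathbf{d}$ being the vector $\overrightarrow{pP}$ and $\mathbf{E}$, $\mathbf{e}$ the unit
vectors of the two lines).

If we multiply this equation in scalar fashion with $\mathbf{E}$ and $\mathbf{e}$, we obtain,
as a result of $\mathbf{E} \cdot \mathbf{k} = 0$ and $\mathbf{e} \cdot \mathbf{k} = 0$, the
two linear equations

$$\mathbf{E}\cdot\mathbf{E}\,R - \mathbf{E}\cdot\mathbf{e}\,r + \mathbf{E}\cdot\mathbf{d} = 0,$$

$$\mathbf{e}\cdot\mathbf{E}\,R - \mathbf{e}\cdot\mathbf{e}\,r + \mathbf{e}\cdot\mathbf{d} = 0,$$

from which the unknowns R and r are obtained.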

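Solution of II. Let the six edges of the tetrahedron be BC = a, CA = b, AB = c, OA = p,
OB = q, OC = r, and let the vectors $\overrightarrow{BC}$, $\overrightarrow{CA}$,
$\overrightarrow{AB}$, $\overrightarrow{OA}$, $\overrightarrow{OB}$, $\overrightarrow{OC}$ be
$\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$. Let the
angle and distance between the two opposite edges c and r be called ω and k, respectively.

Determination of ω. We have

$$\mathbf{c} + \mathbf{r} = \overrightarrow{AB} + \overrightarrow{OC} = \overrightarrow{AO} + \overrightarrow{OB} + \overrightarrow{OA} + \overrightarrow{AC} = \overrightarrow{OB} + \overrightarrow{AC} = \mathbf{q} - \mathbf{b},$$

and thus

$$(\mathbf{c} + \mathbf{r})^2 = (\mathbf{c} + \mathbf{r}) \cdot (\mathbf{q} - \mathbf{b}) = \mathbf{c}\cdot\mathbf{q} + \mathbf{q}\cdot\mathbf{r} - \mathbf{b}\cdot\mathbf{c} - \mathbf{b}\cdot\mathbf{r}.$$

However, since

$$(\mathbf{c} + \mathbf{r})^2 = c^2 + r^2 + 2\,\mathbf{c}\cdot\mathbf{r} = c^2 + r^2 + 2cr\cos\omega,$$

$$2\,\mathbf{c}\cdot\mathbf{q} = c^2 + q^2 - p^2, \quad 2\,\mathbf{q}\cdot\mathbf{r} = q^2 + r^2 - a^2,$$

$$2\,\mathbf{b}\cdot\mathbf{c} = a^2 - b^2 - c^2, \quad 2\,\mathbf{b}\cdot\mathbf{r} = p^2 - b^2 - r^2,$$

the equation obtained is transformed into

(3)  $$2cr\cos\omega = b^2 + q^2 - a^2 - p^2,$$

so that ω is determined.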

Calculation of k. Let the volume of the tetrahedron ABCO, which we can consider as known in
accordance with Euler's formula (No. 68), be called T. We displace the vector $\mathbf{r}$
parallel to itself until it has the starting point A in common with $\mathbf{c}$; its new end
point we will call Q, so that AQ is equal and parallel to OC. Since the triangles CQA and COA
are halves of the parallelogram COAQ, they are congruent, and thus the tetrahedrons CQAB and
COAB have the same volume (T). If we now take QAB as the base surface of the tetrahedron CQAB
and C as the apex, the base surface has the area
$\tfrac{1}{2} AQ \cdot AB \cdot \sin QAB = \tfrac{1}{2} rc \sin\omega$, and the altitude (as
the distance of the point C from the plane QAB that contains the edge c and the line AQ that
is parallel to the opposite edge OC) has a length of k. The volume of the tetrahedron is
therefore $\tfrac{1}{3} \cdot \tfrac{1}{2} cr \sin\omega \cdot k$ and we obtain the formula

(4)  $$6T = kcr\sin\omega.$$

FIG. 82.

Since all the magnitudes in this formula are known with the exception of k, it gives us the
distance k between the opposite edges which we have been looking for.
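
As an illustration, formulas (3) and (4) can be applied to the tetrahedron with the edges
a = 11, b = 10, c = 9, p = 8, q = 7, r = 6 and T = 48 treated in No. 68. The sketch below is
only a worked example with a hypothetical function name; it takes T as given instead of
recomputing it.

```python
import math


def opposite_edge_angle_and_distance(a, b, c, p, q, r, T):
    """cos(omega) and distance k for the opposite edges c = AB and r = OC."""
    cos_w = (b * b + q * q - a * a - p * p) / (2 * c * r)   # formula (3)
    sin_w = math.sqrt(1 - cos_w ** 2)
    k = 6 * T / (c * r * sin_w)                              # formula (4)
    return cos_w, k


cos_w, k = opposite_edge_angle_and_distance(11, 10, 9, 8, 7, 6, 48)
print(cos_w, k)   # cos(omega) = -1/3, k = 4 * sqrt(2)
```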

Note. If we keep in mind that cr sin ω is the magnitude of the vector
$\mathbf{c} \times \mathbf{r}$ and that the shortest distance $\mathbf{k}$ (conceived of as a
vector) between the edges c and r is parallel to $\mathbf{c} \times \mathbf{r}$, we can write

$$6T = \mathbf{k} \cdot (\mathbf{c} \times \mathbf{r})$$

and we have the following

Theorem: The mixed product of two opposite sides of a tetrahedron and the distance between
them is equal to six times the volume of the tetrahedron.

A direct consequence of this theorem is the famous

Theorem of Steiner: All tetrahedrons having two opposite edges of prescribed length lying on
two fixed lines have the same volume.


The Sphere Circumscribing a Tetrahedron


To  determine  the  radius  of  the  sphere  circumscribing  a  tetrahedron  of  which 
all  six  edges  are  given. 


One should compare the developments of Legendre in his Éléments de Géométrie, Note V.

We  will  first  solve  the 

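Preliminary problem: To find the relation between the six great-circle arcs that connect four
points of a spherical surface.

We will call the four points 0, 1, 2, 3, the arcs joining them 01, 02, 03, 23, 31, 12, the
radii (considered as vectors) running to them $\mathbf{r}_0$, $\mathbf{r}_1$, $\mathbf{r}_2$,
$\mathbf{r}_3$ and their common magnitude h. Since there is always a homogeneous linear
relation between four vectors of a space, we have the equation

$$\alpha\mathbf{r}_0 + \beta\mathbf{r}_1 + \gamma\mathbf{r}_2 + \delta\mathbf{r}_3 = 0,$$

in which not all of the coefficients α, β, γ, δ vanish simultaneously. We multiply the
relation sequentially in scalar fashion by $\mathbf{r}_0$, $\mathbf{r}_1$, $\mathbf{r}_2$,
$\mathbf{r}_3$ and obtain the four equations

$$\begin{aligned}
\mathbf{r}_0\mathbf{r}_0\,\alpha + \mathbf{r}_0\mathbf{r}_1\,\beta + \mathbf{r}_0\mathbf{r}_2\,\gamma + \mathbf{r}_0\mathbf{r}_3\,\delta &= 0, \\
\mathbf{r}_1\mathbf{r}_0\,\alpha + \mathbf{r}_1\mathbf{r}_1\,\beta + \mathbf{r}_1\mathbf{r}_2\,\gamma + \mathbf{r}_1\mathbf{r}_3\,\delta &= 0, \\
\mathbf{r}_2\mathbf{r}_0\,\alpha + \mathbf{r}_2\mathbf{r}_1\,\beta + \mathbf{r}_2\mathbf{r}_2\,\gamma + \mathbf{r}_2\mathbf{r}_3\,\delta &= 0, \\
\mathbf{r}_3\mathbf{r}_0\,\alpha + \mathbf{r}_3\mathbf{r}_1\,\beta + \mathbf{r}_3\mathbf{r}_2\,\gamma + \mathbf{r}_3\mathbf{r}_3\,\delta &= 0.
\end{aligned}$$

However, when four homogeneous linear equations with four unknowns (α, β, γ, δ) possess an
actual solution, the determinant of the coefficients of the equations must be equal to zero.
Consequently

$$\begin{vmatrix}
\mathbf{r}_0\mathbf{r}_0 & \mathbf{r}_0\mathbf{r}_1 & \mathbf{r}_0\mathbf{r}_2 & \mathbf{r}_0\mathbf{r}_3 \\
\mathbf{r}_1\mathbf{r}_0 & \mathbf{r}_1\mathbf{r}_1 & \mathbf{r}_1\mathbf{r}_2 & \mathbf{r}_1\mathbf{r}_3 \\
\mathbf{r}_2\mathbf{r}_0 & \mathbf{r}_2\mathbf{r}_1 & \mathbf{r}_2\mathbf{r}_2 & \mathbf{r}_2\mathbf{r}_3 \\
\mathbf{r}_3\mathbf{r}_0 & \mathbf{r}_3\mathbf{r}_1 & \mathbf{r}_3\mathbf{r}_2 & \mathbf{r}_3\mathbf{r}_3
\end{vmatrix} = 0.$$

Here we replace each product $\mathbf{r}_\mu\mathbf{r}_\nu$ by $h^2 \cos \mu\nu$, eliminate
everywhere the factor $h^2$, and obtain the relation we are looking for

(1)  $$\begin{vmatrix}
\cos 00 & \cos 01 & \cos 02 & \cos 03 \\
\cos 10 & \cos 11 & \cos 12 & \cos 13 \\
\cos 20 & \cos 21 & \cos 22 & \cos 23 \\
\cos 30 & \cos 31 & \cos 32 & \cos 33
\end{vmatrix} = 0.$$

(cos 00, cos 11, cos 22, cos 33 are naturally merely symmetrical ways of writing unity.)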

The  solution  of  the  tetrahedron  problem  is  now  simple. 

In order to maintain agreement with the designations of the preliminary problem we will call
the vertexes of the tetrahedron 0, 1, 2, 3, the radius of the sphere of circumscription h. The
edges 01, 02, 03, 23, 31, 12 we will call p, q, r, a, b, c, their squares P, Q, R, A, B, C,
the volume of the tetrahedron T.

We now introduce the four-point relation (1), assign to each cosine the factor $H = 2h^2$ and
replace the new determinant elements in accordance with the cosine theorem, e.g., H cos 01 by
H - P, H cos 02 by H - Q, H cos 23 by H - A, etc. (naturally H cos 00 and the other elements
of the diagonal will be replaced by H). This gives us, after we reverse the sign of all the
elements,

$$\begin{vmatrix}
-H & P - H & Q - H & R - H \\
P - H & -H & C - H & B - H \\
Q - H & C - H & -H & A - H \\
R - H & B - H & A - H & -H
\end{vmatrix} = 0.$$

We now line the bottom of this determinant with ones and the right-hand side with zeros and
obtain

$$\begin{vmatrix}
-H & P - H & Q - H & R - H & 0 \\
P - H & -H & C - H & B - H & 0 \\
Q - H & C - H & -H & A - H & 0 \\
R - H & B - H & A - H & -H & 0 \\
1 & 1 & 1 & 1 & 1
\end{vmatrix} = 0.$$

We now add to the first, second, third, and fourth rows H times the last row; this gives us

$$\begin{vmatrix}
0 & P & Q & R & H \\
P & 0 & C & B & H \\
Q & C & 0 & A & H \\
R & B & A & 0 & H \\
1 & 1 & 1 & 1 & 1
\end{vmatrix} = 0.$$


If we call the minors of the last column $M_1$, $M_2$, $M_3$, $M_4$, $M_5$ and expand the
determinant according to the elements of the last column, we obtain

$$H(M_1 + M_2 + M_3 + M_4) + M_5 = 0.$$

If we also expand the determinant of equation (II) of No. 68 according to the elements of the
last column, that equation assumes the form

$$M_1 + M_2 + M_3 + M_4 = 288T^2.$$

From the last two equations we obtain

$$288HT^2 = -M_5,$$

where

$$M_5 = \begin{vmatrix}
0 & P & Q & R \\
P & 0 & C & B \\
Q & C & 0 & A \\
R & B & A & 0
\end{vmatrix}.$$

Computation gives

$$-M_5 = 2FG + 2GE + 2EF - E^2 - F^2 - G^2,$$

where E, F, G are the three products AP, BQ, CR. If we replace A, B, C, P, Q, R once again by
$a^2$, $b^2$, $c^2$, $p^2$, $q^2$, $r^2$ and designate the products ap, bq, cr of the opposite
edges as e, f, g, the last formula can be written as

$$-M_5 = 2f^2g^2 + 2g^2e^2 + 2e^2f^2 - e^4 - f^4 - g^4.$$

If we consider e, f, g as sides of a triangle, the right side of this formula (according to
Hero) represents 16 times the square of the area j of this triangle.

Thus the equation found for $H = 2h^2$ is transformed into

$$576h^2T^2 = 16j^2,$$

and from this we can obtain the simple formula

$$6hT = j$$

for the radius of the sphere of circumscription. Verbally, this can be stated as follows:

Six times the product of a tetrahedron volume and the radius of its sphere of circumscription
is equal to the area of a triangle whose sides are the products of the opposite edges of the
tetrahedron.
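
The formula 6hT = j can be tried out numerically. The following is only a sketch with a
hypothetical function name: it obtains T from the three-row determinant (I) of No. 68, obtains
j from Hero's formula, and divides. For a regular tetrahedron of edge 1 it should return the
known radius √6/4.

```python
import math


def circumradius(a, b, c, p, q, r):
    """Radius h of the circumscribed sphere from the six edges, via 6hT = j."""
    pp, qq, rr = p * p, q * q, r * r
    qr = (qq + rr - a * a) / 2
    rp = (rr + pp - b * b) / 2
    pq = (pp + qq - c * c) / 2
    gram = (pp * (qq * rr - qr * qr)
            - pq * (pq * rr - qr * rp)
            + rp * (pq * qr - qq * rp))       # equals 36 T^2
    T = math.sqrt(gram) / 6
    e, f, g = a * p, b * q, c * r             # products of opposite edges
    s = (e + f + g) / 2
    j = math.sqrt(s * (s - e) * (s - f) * (s - g))   # Hero's formula
    return j / (6 * T)


print(circumradius(1, 1, 1, 1, 1, 1), math.sqrt(6) / 4)
```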

Note. The question of the radius ρ of the sphere inscribed in a tetrahedron is much simpler.
The lines joining the center Z of the inscribed sphere and the corner points of the four
triangles bounding the tetrahedron divide the tetrahedron into four pyramids with the common
apex Z and the volumes $\tfrac{1}{3}\rho\,\mathrm{I}$, $\tfrac{1}{3}\rho\,\mathrm{II}$,
$\tfrac{1}{3}\rho\,\mathrm{III}$, $\tfrac{1}{3}\rho\,\mathrm{IV}$, where I, II, III, IV are
the areas of the bounding triangles. We thus obtain the formula

$$T = \tfrac{1}{3}\rho\,(\mathrm{I} + \mathrm{II} + \mathrm{III} + \mathrm{IV}).$$

This equation represents ρ as a function of the tetrahedron edges, since I, II, III, IV, and T
are known functions of the edges.


The Five Regular Solids


To  divide  the  surface  of  a  sphere  into  congruent  regular  spherical  polygons. 


Solution.  We  will  call  the  required  division  “regular”  and  we  will  first  answer 
the  question  concerning  the  maximum  possible  number  of  regular  divisions. 

We will assume that the sphere is covered completely and without any gaps by z regular n-gons
and that at every corner of such an n-gon v sides come together. We divide each n-gon by means
of the spherical radii running from the center to the vertexes into n isosceles triangles.
Each of these triangles possesses the central angle 2π/n and the base angle π/v (since at each
vertex 2v such base angles come together), and thus the spherical excess of each is

$$\varepsilon = \frac{2\pi}{n} + \frac{2\pi}{v} - \pi.$$

Now, the area of such a triangle, when r is the spherical radius, is $r^2\varepsilon$; the
area of an n-gon is thus $nr^2\varepsilon$ and the area of the spherical surface consisting of
z such n-gons is $znr^2\varepsilon$. Accordingly, we obtain the equation

$$znr^2\varepsilon = 4\pi r^2$$

or

$$\frac{1}{n} + \frac{1}{v} = \frac{1}{2} + \frac{2}{zn}.$$

Since the left side of this equation is > 1/2 and at the same time n as well as v must be > 2,
we obtain the following five possibilities for n, v, and z:


    n    v    z
    3    3    4
    3    4    8
    3    5    20
    4    3    6
    5    3    12

Thus,  there  are  only  five  possible  regular  divisions  of  a  spherical  surface:  by 
dividing  the  surface  with 

1.  four  regular  triangles, 

2.  six  regular  tetragons, 

3.  eight  regular  triangles, 

4.  twenty  regular  triangles, 

5.  twelve  regular  pentagons. 
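
The five possibilities can also be found by a direct search. The following is a small sketch
(the function name and the search bound are assumptions made for illustration): it tries all
pairs n, v greater than 2 up to the bound, keeps those with 1/n + 1/v > 1/2, and solves the
equation 1/n + 1/v = 1/2 + 2/(zn) for z.

```python
from fractions import Fraction


def regular_divisions(bound=50):
    """All (n, v, z) with 3 <= n, v < bound satisfying 1/n + 1/v = 1/2 + 2/(z n)."""
    found = []
    for n in range(3, bound):
        for v in range(3, bound):
            excess = Fraction(1, n) + Fraction(1, v) - Fraction(1, 2)
            if excess > 0:
                z = 2 / (n * excess)          # exact rational arithmetic
                if z.denominator == 1:
                    found.append((n, v, int(z)))
    return found


print(regular_divisions())
```

The bound is harmless: for n and v both at least 4 the sum 1/n + 1/v no longer exceeds 1/2, so
larger values cannot contribute.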

If we connect every two adjacent corners of such a spherical n-gon by means of a line segment,
we obtain a regular plane n-gon bounded by the n line segments that connect the corners. If we
construct this plane n-gon for each of the z spherical n-gons, we obtain a regular polyhedron
bounded by z regular n-gons, or a so-called regular solid.

There are accordingly only five regular solids, namely, the regular tetrahedron, hexahedron
(the cube), octahedron, icosahedron, and dodecahedron.

In  the  following  we  will  actually  carry  out  the  five  regular  divisions  of  the 
spherical  surface,  which  we  had  initially  only  shown  to  be  possible.  For 
convenience  in  viewing  the  sphere  we  will  imagine  it  as  a  globe  with  a  north 
pole  N  and  a  south  pole  S  and  with  meridians  and  latitudinal  circles. 

I.  The  tetrahedron  (n  =  3,  v  =  3,  z  =  4).  On  the  three  meridians  0°,  120°,  240° 
we  lay  off  from  N  the  three  equal  arcs  NA,  NB,  NC  such  that  the  triangles  NBC, 
NCA,  NAB  are  equilateral.  The  three  arcs  BC\  CA,  AB  enclosing  the  south  pole 
then  also  form  an  equilateral  triangle  that  is  congruent  to  the  designated 
triangles,  and  the  spherical  surface  has  been  divided  into  the  four  regular 
triangles  NBC,  NCA,  NAB,  ABC. 

II.  The  hexahedron  (n  =  4,  v  =  3,  z  =  6).  On  the  four  meridians  0°,  90°,  180°, 
270° we lay off from N and S the eight equal arcs NA, NB, NC, ND and SC', SD',
SA', SB' (each one equal to h) such that each of the arcs AC', BD', CA', DB' is
equal to AB (= 2k). k is obtained from the spherical triangle NAB by means of the
equation

cos 2k = cos h cos h.

Since on the one hand 2h + 2k = NA + SC' + AC' = NS = 180° or h + k = 90°, and
thus cos h = sin k, and on the other hand $\cos 2k = 1 - 2\sin^2 k$, we obtain

$$1 - 2\sin^2 k = \sin^2 k$$

and consequently

$$\sin k = \cos h = \sqrt{\tfrac{1}{3}}, \qquad \cos 2k = \tfrac{1}{3}.$$

The corners A, B, C, D, A', B', C', D' defined by these conditions are the eight
corners of the cube.
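The values just derived can be checked numerically. The following Python sketch is only an illustration: the helper names `on_sphere` and `arc` are not from the text, and the meridian 0° is assumed to be the one carrying A and C'.

```python
import math

# From 1 - 2 sin^2 k = sin^2 k we get sin k = 1/sqrt(3).
k = math.asin(1 / math.sqrt(3))
h = math.pi / 2 - k  # because h + k = 90 degrees


def on_sphere(colatitude, longitude_deg):
    """Unit vector at the given arc from N on the given meridian."""
    lon = math.radians(longitude_deg)
    return (math.sin(colatitude) * math.cos(lon),
            math.sin(colatitude) * math.sin(lon),
            math.cos(colatitude))


def arc(p, q):
    """Great-circle distance between two unit vectors, in radians."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


A = on_sphere(h, 0)
B = on_sphere(h, 90)
C_prime = on_sphere(math.pi - h, 0)  # arc SC' = h on the meridian 0

print(round(math.degrees(h), 2), round(math.degrees(2 * k), 2))
print(math.isclose(arc(A, B), 2 * k), math.isclose(arc(A, C_prime), 2 * k))
```

Both comparisons should come out true: the arc AB between neighbouring meridians and the arc AC' along a meridian have the same length 2k.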

III.  The  octahedron  (n  =  3,  v  =  4,  z  =  8).  The  corners  of  the  octahedron  are  the 
points  N,  S  and  four  equator  points  separated  from  each  other  by  90°. 

IV. The icosahedron (n = 3, v = 5, z = 20). We choose ten meridians 36° apart
and  call  them  1,  2,  3,. . .,  10.  On  the  meridians  1,  3,  5,  7,  9  we  lay  off  from  N  the 
equal  arcs  NA,  NB,  NC,  ND,  NE,  and  on  the  meridians  6,  8,  10,  2,  4  we  lay  off 
from S the equal arcs SA', SB', SC', SD', SE' such that the ten triangles NAB,
NBC, NCD, NDE, NEA, SA'B', SB'C', SC'D', SD'E', SE'A' are equilateral. The
common length 2k of the marked-off arcs can be obtained, for example, from
one of the right triangles NBO, NCO, into which the meridian 4 divides the
equilateral triangle NBC. Since ∠BNO = 36°, ∠OBN = 72°, it follows from
triangle NBO that

$$\cos BO = \cos k = \frac{\cos 36^\circ}{\sin 72^\circ} = \frac{1}{2\sin 36^\circ}$$

and from this that 2k = 63°26'.
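The stated value of 2k can be reproduced with a few lines of Python. This is only an arithmetic check of the formula above; the variable names are illustrative.

```python
import math

# cos k = cos 36° / sin 72°, from the right triangle NBO.
cos_k = math.cos(math.radians(36)) / math.sin(math.radians(72))
two_k_deg = 2 * math.degrees(math.acos(cos_k))

whole_degrees = int(two_k_deg)
minutes = round((two_k_deg - whole_degrees) * 60)
print(whole_degrees, minutes)  # 63 26
```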




If we extend NO by its own length to H, we obtain the isosceles triangle NBH
with the base NH = 2h and the legs BN = BH = 2k, the base angle 36°, and the
apex angle HBN = 144°. Since these angles have the same sine, the sines of their
opposite sides NH and NB are equal according to the sine theorem. But since
these opposite sides (2h and 2k) are not equal, 2h must be the supplement of 2k.
And since NE' is also the supplement of 2k (= SE'), then necessarily


NE' = 2h = NH.


Accordingly, point H coincides with E' and E'B is equal to 2k, i.e., equal to NB.
In similar fashion each of the arcs AD', D'B, E'C, CA', A'D, DB', B'E, EC', C'A is
equal to 2k, and the ten “encircling” triangles ABD', D'E'B, BCE', E'A'C, CDA',
A'B'D, DEB', B'C'E, EAC', C'D'A are likewise equilateral triangles and also
congruent to the ten equilateral triangles above.

The  12  points  N,  S,  A,  B,  C,  D,  E,  A',  B',  C',  D',  E'  are  thus  the  vertexes  of  20 
equilateral triangles that completely cover the sphere; they are the 12 corners of
the regular icosahedron.
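The whole construction can be checked by placing the 12 points on a unit sphere and counting the arcs of length 2k. The sketch below assumes meridian m lies at longitude 36°·(m − 1); the function names are illustrative and not part of the text.

```python
import math
from itertools import combinations

# Edge length 2k from cos k = 1 / (2 sin 36°).
two_k = 2 * math.acos(1 / (2 * math.sin(math.radians(36))))


def point(colatitude, meridian):
    """Unit vector at the given arc from N on meridian 1..10 (36° apart)."""
    lon = math.radians(36 * (meridian - 1))
    return (math.sin(colatitude) * math.cos(lon),
            math.sin(colatitude) * math.sin(lon),
            math.cos(colatitude))


def arc(p, q):
    """Great-circle distance between two unit vectors, in radians."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))


vertices = [point(0.0, 1), point(math.pi, 1)]                      # N, S
vertices += [point(two_k, m) for m in (1, 3, 5, 7, 9)]             # A..E
vertices += [point(math.pi - two_k, m) for m in (6, 8, 10, 2, 4)]  # A'..E'

# Pairs of points whose arc equals 2k are the edges of the solid.
edges = [(i, j) for i, j in combinations(range(12), 2)
         if math.isclose(arc(vertices[i], vertices[j]), two_k, abs_tol=1e-9)]
# Triples of mutually adjacent points are the equilateral triangles.
faces = [t for t in combinations(range(12), 3)
         if all(pair in edges for pair in combinations(t, 2))]
print(len(edges), len(faces))  # 30 20
```

The counts of 30 edges and 20 triangles agree with the 20 equilateral triangles named in the text.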

V. The dodecahedron (n = 5, v = 3, z = 12). As in the icosahedron, we begin
the construction of the dodecahedron by laying off a system of ten meridians 1,
2, 3, ..., 10 that are 36° apart. About N as a common apex we group five
congruent isosceles triangles NAB, NBC, NCD, NDE, NEA with the apex angle
72° and the base angle 60° (= 180°/v) whose base vertexes A, B, C, D, E lie on
the meridians 1, 3, 5, 7, 9. Thus we obtain the regular pentagon ABCDE. In the
same way we draw about S as a common center point the regular pentagon
A'B'C'D'E' whose vertexes A', B', C', D', E' lie on the meridians 6, 8, 10, 2, 4.


If O and O' represent the base midpoints of the isosceles triangles ABN and
D'E'S, then NAO and SD'O' are right triangles with the angles 60° and 36°.

Our construction is now based on the theorem (proved below): “The
perimeter of a spherical right triangle with angles of 60° and 36° is 90°.”

If we designate the hypotenuse, the long leg, and the short leg of such a triangle
as l, h, and k, then

(1)  l + h + k = 90°.

If we remember that

NA = SD' = l,  NO = SO' = h,  AO = D'O' = k,

we see that 2k is the side, l the radius of the circumscribed circle (on the sphere),
h the radius of the inscribed circle, and s = l + h the altitude of the pentagon
ABCDE or A'B'C'D'E'.

We now mark off on the meridians 1, 3, 5, 7, 9 from A, B, C, D, E southwards
and on the meridians 6, 8, 10, 2, 4 from A', B', C', D', E' northwards the
pentagon side 2k, which gives us the points F, G, H, K, L, F', G', H', K', L'.

Now since, according to (1), each meridian consists of the four segments l,
2k, s, and h, it follows that OG and O'H, for example, represent the pentagon
altitude s; i.e., the pentagons ABHGF and D'E'KHG are congruent to the regular
pentagon ABCDE. The same is naturally true of the pentagons BCLKH, CDG'F'L,
DEK'H'G', EAFL'K', E'A'F'LK, A'B'H'G'F', B'C'L'K'H', C'D'GFL'.

With  the  12  regular  pentagons  already  designated  the  sphere  is  completely 
covered. 

The points A, B, C, D, E, F, G, H, K, L, A', B', C', D', E', F', G', H', K', L' are
accordingly the 20 corners of the regular dodecahedron.

Supplement: Proof of the Theorem: “The perimeter of a spherical right
triangle with the angles 60° and 36° is 90°.”

Let the sides of the triangle be a, b, c, their opposite angles α = 60°, β = 36°, γ
= 90°. We express the tangents of the sides by the regular decagon side z = 2 sin
18° corresponding to the unit circle, for which it is known that $z^2 + z = 1$.

1. Firstly,

$$\cos\beta = 1 - 2\sin^2 18^\circ = 1 - \tfrac{1}{2}z^2 = \frac{1}{2z}$$

or

$$\sec\beta = 2z.$$

2. From sec c = tan α tan β it follows that $\sec^2 c = 3\tan^2\beta$ or $\tan^2 c + 1 =
3(\sec^2\beta - 1)$ or $\tan^2 c = 4(3z^2 - 1)$. However, $3z^2 - 1 = z^2 + (2z^2 - 1) = z^2 + (1 - 2z)
= (1 - z)^2 = z^4$, and thus

$$\tan c = 2z^2.$$

3. $\tan a = \tan c \cos\beta = 2z^2/2z = z$.

4. $\tan b = \tan c \cos\alpha = 2z^2 \cdot \tfrac{1}{2} = z^2$.

Now we have

$$\tan c\,\tan(a + b) = 2z^2 \cdot \frac{z + z^2}{1 - z^3} = \frac{2z^2}{1 - z^3} = \frac{2z^2}{(1 - z)[1 + z + z^2]} = \frac{2z^2}{z^2[1 + 1]} = 1.$$

Consequently, a + b is the complement of c. Q.E.D.
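The theorem and the intermediate tangent values can be checked numerically. The sketch below computes the sides from the standard relations for a right-angled spherical triangle (cos a = cos α / sin β, cos b = cos β / sin α, cos c = cot α cot β); these relations are assumed here and are not the route taken in the proof above.

```python
import math

alpha, beta = math.radians(60), math.radians(36)

# Sides of the right-angled spherical triangle (right angle at gamma).
a = math.acos(math.cos(alpha) / math.sin(beta))
b = math.acos(math.cos(beta) / math.sin(alpha))
c = math.acos(1 / (math.tan(alpha) * math.tan(beta)))

z = 2 * math.sin(math.radians(18))  # side of the regular decagon, z^2 + z = 1
perimeter_deg = math.degrees(a + b + c)

print(round(perimeter_deg, 9))
print(math.isclose(math.tan(a), z),
      math.isclose(math.tan(b), z * z),
      math.isclose(math.tan(c), 2 * z * z))
```

The perimeter should agree with 90° up to rounding, and the three comparisons correspond to steps 3, 4, and 2 of the proof.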

The  regular  solids  were  already  known  to  the  Pythagoreans  and  thus  go  back 
to  the  sixth  century  b.c.  The  proof  that  there  are  only  five  regular  solids  probably 
stems  from  Euclid  (ca.  330-275  b.c.). 
The Square as an Image of a Quadrilateral


To show that every quadrilateral can be considered as a perspective image of
a square.


The  perspective  projection,  perspectivity  or  central  projection,  the  simplest 
and  most  important  of  all  projections,  can  be  explained  as  follows.  Given  are  a 
fixed  point  Z,  the  center  of  projection,  and  a  fixed  plane  E,  the  plane  of  the 
image.  The  perspective  image  or,  more  briefly,  the  perspective  of  an  arbitrary 
point  P0  is  understood  to  mean  the  point  of  intersection  P  of  the  “projection  ray” 

ZP0  with  the  plane  of  the  image.  P0  is  the  “object,”  P  the  “image.”  The  image  of 
a  figure  is  the  totality  of  the  images  of  the  points  of  which  the  figure  (the  object) 
consists.  Thus,  the  perspective  of  a  straight  line  g0  is  a  straight  line  g,  namely  the 
intersection  of  the  plane  Zg0  with  the  plane  of  the  image. 

Of particular importance is the perspective projection in which only points of
a plane E0, the object plane, are projected onto the image plane. The line of
intersection a of the object plane and the image plane is called the axis of
perspectivity. The axis of perspectivity is the locus of the object points that
coincide with their images. An arbitrary object line and its image
accordingly intersect at the axis.

A  noteworthy  role  in  this  perspectivity  is  played  by  the  infinitely  distant 
points  of  the  object  plane.  Since  the  projection  rays  to  the  infinitely  distant 
points  of  E0  run  parallel  to  E0,  they  lie  in  a  plane  A  passing  through  Z  and 
parallel to E0 and consequently meet the image plane at the line of intersection f
of  this  plane  with  A.  This  line  of  intersection  is  called  the  vanishing  line  of  the 
object  plane  E0.  The  vanishing  line  is  parallel  to  the  axis  of  perspectivity. 

In  order  to  avoid  limiting  the  general  validity  of  the  above  theorem,  “The 
perspective  of  a  line  is  also  a  line,”  by  a  special  case,  we  call  the  totality  of 
infinitely  distant  points  of  E0  the  “infinitely  distant  line”  of  this  plane  and  can 
then  state  briefly  that: 

The perspective of the infinitely distant line of a plane is the vanishing line of
this plane.

The place at which the image g of an arbitrary line g0 of E0 intersects the
vanishing line f and which is the image of the infinitely distant point of g0 is
called the vanishing point of g0.

Now  for  the  solution  of  our  problem! 



FIG.  85. 


Let  the  quadrilateral  ABCD  in  the  drawing  plane  E  be  the  given  quadrilateral, 
let  O  be  the  point  of  intersection  of  the  diagonals  AC  and  BD,  P  the  point  of 
intersection  of  the  opposite  sides  AB  and  CD,  Q  the  point  of  intersection  of  the 
opposite sides BC and DA. Let the square we are looking for be called
accordingly A0B0C0D0, the point of intersection of its diagonals O0, its plane E0.

Since  the  points  of  intersection  P0  and  Q0  of  the  two  pairs  of  opposite  sides  lie 
on  the  infinitely  distant  line  of  E0,  their  images  P  and  Q  must  lie  on  the 
vanishing line f of the perspectivity passing from E0 to E. We accordingly choose
the line PQ as the vanishing line f. It makes no difference which parallel to f we
choose as the axis of perspectivity a. We choose the parallel through A. The
points  of  intersection  of  the  axis  with  the  lines  CD,  BC,  OP,  OQ,  and  BD  we 
designate  as  H,  K,  M,  N,  and  S.  Since  each  object  line  meets  the  corresponding 
image  line  at  the  axis,  these  points  may  also  be  called  H0,  K0,  M0,  N0,  S0. 


In  the  quadrilateral  ABCD  the  opposite  sides  PBA  and  PCD  and  the  diagonals 
PO  and  PQ  form  a  harmonic  ray  pencil.  Since  the  ray  PQ  runs  parallel  to  the 
line  a,  the  segments  MA  and  MH  are  of  equal  length. 

In  the  quadrilateral  ABCD  the  opposite  sides  QCB  and  QDA  and  the 
diagonals QO and QP also form a harmonic ray pencil. Since QP ∥ a, the
segments  NA  and  NK  are  also  equally  long. 
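The two midpoint statements (MA = MH and NA = NK) can be verified with exact rational arithmetic for a sample quadrilateral. The coordinates below are arbitrary example values, and `intersect` and `midpoint` are illustrative helpers, not part of the text.

```python
from fractions import Fraction as Fr


def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (lines assumed not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


A, B = (Fr(0), Fr(0)), (Fr(4), Fr(0))
C, D = (Fr(5), Fr(3)), (Fr(1), Fr(2))

O = intersect(A, C, B, D)   # diagonals
P = intersect(A, B, C, D)   # opposite sides AB and CD
Q = intersect(B, C, D, A)   # opposite sides BC and DA

# Axis a: the parallel to PQ through A (second point obtained by translation).
A2 = (A[0] + Q[0] - P[0], A[1] + Q[1] - P[1])
H = intersect(A, A2, C, D)
K = intersect(A, A2, B, C)
M = intersect(A, A2, O, P)
N = intersect(A, A2, O, Q)

print(M == midpoint(A, H), N == midpoint(A, K))  # True True
```

Because fractions are exact, the equalities hold without any tolerance for this example.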

Since  the  diagonals  of  the  sought-for  square  must  meet  the  diagonals  of  the 
given  quadrilateral  at  the  axis,  the  diagonals  of  the  square  must  pass  through  A 
and  S.  The  point  of  intersection  O0  of  the  diagonals  accordingly  lies  on  the 
semicircle  with  the  diameter  AS  belonging  to  the  plane  E0. 

Since the midlines M0O0 and N0O0 of the square pass through O0, O0 also
lies on the semicircle with the diameter MN in the plane E0.

The  point  of  intersection  of  the  two  semicircles  is  the  center  point  O0  of  the 
square. 

The sides A0B0 and C0D0 of the square are the parallels through A and H to
MO0, the sides B0C0 and D0A0 of the square are the parallels through K and A to
NO0.

For convenience we execute the drawing (cf. Figure 85) in the drawing plane
itself. Then, in order to obtain the spatial perspectivity we are looking for, we
rotate the square about the axis a as an axis of rotation into a new plane E0, draw
through f the plane A parallel to E0, join the point of intersection of the
diagonals, O0, now lying in E0, with O, and designate the point of intersection of
this connecting line with A as Z.

If we now project the square A0B0C0D0 lying in E0 from the center Z onto E,
we thereby obtain as a perspective image the quadrilateral ABCD.
The Pohlke-Schwarz Theorem


Four  arbitrary  points  of  a  plane  that  do  not  all  lie  on  the  same  line  can  be 
considered  as  an  oblique  image  of  the  corners  of  a  tetrahedron  that  is  similar  to 
a  given  tetrahedron. 

This  fundamental  theorem  of  oblique  parallel  projection,  proved  by  H.  A. 
Schwarz (1843-1921) in 1864 (Crelles Journal, vol. 63; also, Schwarz,
Gesammelte Abhandlungen), includes as a special case the theorem formulated
in  1853  by  K.  Pohlke  (1810-1876): 


The  fundamental  theorem  of  oblique  axonometry:  Three  arbitrary 
segments  originating  from  a  single  point  in  a  plane  that  do  not  all  belong  to  the 
same  line  can  be  considered  as  the  oblique  image  of  a  tripod. 

Before  taking  up  the  proof  of  this  theorem  we  shall  make  several  prefatory 
remarks  about  oblique  projection,  affinity,  and  axonometry. 

An  oblique  projection  is  a  projection  of  a  plane  or  three-dimensional  figure, 
an  object  figure,  onto  the  drawing  plane  or  image  plane  in  which  each  object 
point  is  projected  onto  the  image  plane  by  a  “projection”  ray  drawn  in  a  fixed 
direction.  If  the  projection  rays  are  perpendicular  to  the  image  plane,  the  oblique 
projection  is  called  a  normal  or  orthogonal  projection. 

The  oblique  projection  of  points  of  a  plane  (the  object  plane)  onto  the  image 
plane  is  a  so-called  affinity. 

An  affinity  or  affine  projection  is  understood  to  mean  a  projection  of  an 
object  plane  onto  the  picture  plane  (which  may  also  lie  in  the  object  plane)  in 
which the points of the object plane are transformed into points of the image
plane  in  such  manner  that  they  exhibit  the  following  fundamental  properties: 

I.  The  affine  image  of  a  line  is  also  a  line. 

II.  Parallelism  is  not  annulled  by  affine  projection.  (The  image  of  a 
parallelogram  is  a  parallelogram.) 

III.  The  ratio  of  parallel  segments  is  not  altered  by  affine  projection.  In  other 
words:  Parallel  segments  are  projected  in  the  same  proportion.  (This  third 
property  is  a  consequence  of  I.  and  II.) 

It  is  therefore  immediately  evident  that  the  oblique  projection  of  a  plane  onto 
a  second  plane  possesses  these  three  fundamental  properties. 

The  most  general  affinity  between  two  arbitrary  planes  E  and  E'  is 
determined  by  the  mutual  correspondence  between  two  arbitrary  triangles  ABC 
and  A'B'C'  of  these  planes,  where  A',  B',  C'  are  determined  as  the  affine  images 
of  A,  B,  C,  respectively.  The  affine  image  P'  of  an  arbitrary  object  point  P  (of  E) 
is  drawn  by  letting  AP  intersect  with  the  side  BC  at  H,  then  (according  to  III.) 
determining  the  affine  image  H'  of  H  on  the  line  B'C'  by  means  of  the  condition 
B'H':C'H' = BH:CH, and finally determining P' on A'H' by means of the
condition A'P':H'P' = AP:HP.
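The same affine image can be computed with barycentric weights, which is an equivalent substitute for the ratio construction just described rather than the construction itself. In the sketch below the triangles and the point P are arbitrary example values, and the function names are illustrative.

```python
from fractions import Fraction as Fr


def barycentric(p, a, b, c):
    """Weights (u, v, w) with u + v + w = 1 and p = u*a + v*b + w*c."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    v = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    w = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return 1 - v - w, v, w


def affine_image(p, src, dst):
    """Image of p under the affinity that sends triangle src to triangle dst."""
    u, v, w = barycentric(p, *src)
    return tuple(u * dst[0][i] + v * dst[1][i] + w * dst[2][i] for i in range(2))


src = ((Fr(0), Fr(0)), (Fr(4), Fr(0)), (Fr(0), Fr(4)))   # A, B, C
dst = ((Fr(1), Fr(1)), (Fr(5), Fr(2)), (Fr(2), Fr(6)))   # A', B', C'
P = (Fr(1), Fr(1))

P_img = affine_image(P, src, dst)

# H is where AP meets BC; in weights it is P with the share of A removed.
u, v, w = barycentric(P, *src)
H = tuple((v * src[1][i] + w * src[2][i]) / (v + w) for i in range(2))
H_img = affine_image(H, src, dst)
print(P_img, H_img)
```

In this example H is the midpoint of BC and P the midpoint of AH, so H' is the midpoint of B'C' and P' the midpoint of A'H', in agreement with the two ratio conditions.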

A frequently employed method of drawing the oblique projection of a
three-dimensional figure is the axonometric method. In this method the points P of the
three-dimensional figure are determined by their coordinates x|y|z, most
commonly in a perpendicular coordinate system. Three equal segments OA, OB,
and OC are laid off from the origin O on the axes; these segments form a so-called
tripod. The oblique outline O'A'B'C' of the tripod is drawn, and this also
gives us the oblique images of the coordinate axes. We then construct, in
accordance with III., the oblique image of the point P, which in this context is
called the axonometric image.

It  is  now  of  fundamental  importance  to  know  whether  three  arbitrary 
segments O'A', O'B', O'C' originating from a point O' of the drawing plane can
be  considered  as  the  oblique  projection  of  a  tripod  OABC.  This  question  was 
answered  by  Pohlke  and,  in  a  somewhat  more  general  fashion,  by  Schwarz,  as 
mentioned  above. 

Of the numerous proofs of the Pohlke-Schwarz fundamental theorem the
following (stemming from Schwarz) is quite elementary. It is based upon the
theorem of Lhuilier, which is in itself very interesting: The sections of an
arbitrary three-edged prism include all the possible forms of triangles. In other
words: Every triangle can be considered as the normal projection of a triangle of
given form. This theorem was stated in 1811 by the French-Swiss mathematician
Simon Lhuilier (1750-1840).

Proof. Since parallel sections of a prism are congruent, we can assume that
the prescribed triangle A0B0C0, which is also the cross section of the prism, and
the sought-for prism section ABC, which possesses a prescribed form, have a
common vertex, C = C0. If we now drop the perpendiculars A0X and B0Y from
A0 and B0 to the intersection line (axis) g of the two planes E0 of A0B0C0 and E
of ABC and rotate the plane E about g as the rotation axis to the plane E0, then A
and B, as the figure shows, fall on the perpendiculars A0X and B0Y, respectively,
and the point of intersection S = S0 of the lines A0B0 and AB falls on the axis.

We now draw the perpendicular to the axis through C and let it meet A0B0 at
T0 and AB at T. If we designate the cosine of the angle formed by the plane E in
its original position with E0 as μ, then A0X = μ · AX, B0Y = μ · BY, T0C = μ · TC.

Now according to the ray theorem,

SA:AT:TB = S0A0:A0T0:T0B0.

We can therefore draw a parallel S1A1T1B1 to SATB that cuts the lines g, CA, CT,
CB at S1, A1, T1, B1 and is congruent to S0A0T0B0 (so that S1A1 = S0A0, A1T1 =
A0T0, T1B1 = T0B0). We displace the triangle S1B1C in such a way that S1 falls on
S0, A1 on A0, T1 on T0, B1 on B0. The vertex C then falls on a point V of the
semicircle described about the diameter S0T0 (since S1CT1 is a right triangle),
on which C lies, also.

From this fact we obtain the following simple method for constructing the
described figure when the triangle A0B0C0 and the form of the triangle ABC are
given.

We draw over A0B0 the triangle A0B0V that is similar to the triangle ABC
(with A0, B0, V being homologous to A, B, C, respectively). We let the
perpendicular bisector of CV intersect A0B0 at M and draw the semicircle with the
center M and the radius MC = MV. The end points S0 and T0 of the semicircle,
which lie on the line A0B0, we designate in such manner that S0V and T0C
become sides (not diagonals) of the chord quadrilateral S0T0CV. We then choose
CS0 as the axis and CT0 as the perpendicular to the axis. On the axis we make
CS1 = VS0, on the perpendicular to the axis CT1 = VT0, and we draw the line
S1A1T1B1 ≅ S0A0T0B0. Finally, we draw parallel to S1A1T1B1 the line SATB of
which S, A, T, B lie on the perpendiculars through S0, A0, T0, B0, respectively,
while at the same time A lies on CA1 and B lies on CB1.

If we rotate the triangle ABC about CS as the axis of rotation by the angle
whose cosine μ = C0T0/CT as the angle of rotation, A0B0C0 then appears as the
normal projection of the rotated triangle ABC, which possesses the prescribed
form.

That the ratio μ = C0T0/CT can be considered as a cosine, i.e., is a proper
fraction, is shown as follows. According to the ray theorem, CT = CT1 ·
(CS/CS1), i.e., according to the construction, = VT0 · CS/VS. If we introduce this
value into the equation for μ, we obtain

$$\mu = \frac{CT_0 \cdot VS}{CS \cdot VT_0}.$$

However, since, according to the theorem of Ptolemy, in the chord quadrilateral
ST0CV the product CT0 · VS of the opposite sides is smaller than the product CS ·
VT0 of the diagonals, μ represents a proper fraction.

This  proves  the  auxiliary  theorem  concerning  the  prism. 

The proof of the Pohlke-Schwarz theorem is now easy. We can state the
theorem  in  the  following  manner: 

The  oblique  image  of  a  given  tetrahedron  can  always  be  determined  in  such 
manner  that  it  is  similar  to  a  given  quadrilateral. 

Let the tetrahedron be ABCS, the quadrilateral A'B'C'D'.

In the affinity between the planes ABC and A'B'C', in which A', B', C' are
correlated to the points A, B, and C, respectively, let the point D correspond to
the point D'. We select SD as the direction of the affinity (projection ray).

We construct the triangular prism whose edges are parallel to SD through A,
B, and C, and determine the section A"B"C" that is similar to A'B'C'.

In the affinity in which the points A", B", C" are correlated to the points A', B',
C', let the point D" correspond to the point D'. Then A"B"C"D" is similar to
A'B'C'D'. Now, since A"B"C"D" and also ABCD are affine with respect to A'B'C'D',
then A"B"C"D" is also affine to ABCD.

The  latter  affinity,  however,  arises  from  the  projection  rays  parallel  to  SD.  In 
this  affinity  the  quadrilateral  A"B"C"D"  that  is  similar  to  A'B'C'D'  is  thus  the 
oblique  image  of  the  given  tetrahedron  ABCS. 


74 


Gauss’  Fundamental  Theorem  of  Axonometry 


Though three segments OA, OB, OC originating from a point O in the
drawing plane (image plane), not all three of which belong to the same
straight line, can always, according to Pohlke's fundamental theorem (No. 73), be
considered as an oblique projection of a tripod, this is no longer the case for the
normal projection of a tripod. Rather, there exists between the lengths and
directions of the normal projections OA, OB, OC of the three legs a definite
relationship. Thus we come to

Gauss’  problem:  What  is  the  relation  between  the  normal  projections  OA, 
OB,  OC  of  the  legs  of  a  tripod? 


Solution. We select the image plane E as the xy-plane, the perpendicular to
this plane from the apex of the tripod as the z-axis of a triaxial orthogonal
coordinate system; we take the common length of the three legs as the unit
length and call the direction cosines of the legs λ|λ'|λ'', μ|μ'|μ'', and ν|ν'|ν''. At the
same time we take the xy-plane as the Gauss plane (the plane of complex
numbers) and designate the complex number represented by any point (P) of E
by the corresponding small gothic letter ($\mathfrak{p}$).

Since the three points A, B, C in E have the coordinates λ|λ', μ|μ', ν|ν',

$$\mathfrak{a} = \lambda + i\lambda', \qquad \mathfrak{b} = \mu + i\mu', \qquad \mathfrak{c} = \nu + i\nu'.$$

Squaring and adding, we obtain

$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = (\lambda^2 + \mu^2 + \nu^2) - [\lambda'^2 + \mu'^2 + \nu'^2] + 2i\{\lambda\lambda' + \mu\mu' + \nu\nu'\}.$$

According to the well-known relations between the direction cosines of three
mutually perpendicular lines, the expression within parentheses and the
expression within brackets both equal one, while the expression within the
braces is equal to zero. This gives us the Gauss equation

$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = 0.$$
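The Gauss equation is easy to test numerically for any orthonormal triple of legs. The sketch below builds such a triple from three elementary rotations with arbitrary example angles; the helper names and the choice of angles are assumptions made only for this illustration.

```python
import math


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# An arbitrary rotation; its columns are three mutually perpendicular unit legs.
R = matmul(rot_z(0.7), matmul(rot_x(1.1), rot_z(-0.4)))
legs = [tuple(R[i][j] for i in range(3)) for j in range(3)]

# Normal projection onto the xy-plane, read as complex numbers.
a, b, c = (complex(x, y) for x, y, _ in legs)
gauss_sum = a * a + b * b + c * c
print(abs(gauss_sum) < 1e-12)  # True
```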


This  formula  forms 

Gauss’  fundamental  theorem  of  normal  axonometry:  If  in  the  normal 
projection  of  a  tripod  the  image  plane  is  considered  as  the  plane  of  complex 
numbers,  the  projection  of  the  apex  of  the  tripod  as  the  null  point,  and  the 
projections  of  the  leg  ends  as  complex  numbers  of  the  plane,  the  quadratic  sum 
of  these  numbers  is  equal  to  zero. 

The  Gauss  theorem  immediately  provides  the  solution  of  the 

Fundamental problem of normal axonometry: To complete the normal
projection OABC of a tripod of which the normal projections OA and OB of two
of the legs are already drawn.

Solution. We select (as above) the point O as the null point of the complex
number plane and the direction of OA as the direction of the positive real number
axis. The magnitudes of the three numbers $\mathfrak{a}$, $\mathfrak{b}$, $\mathfrak{c}$ we will designate as a, b, c,
and the three angles BOC, COA, AOB as α, β, γ.

We write the Gauss equation

$$\frac{\mathfrak{c}^2}{\mathfrak{a}} = -\left(\mathfrak{a} + \frac{\mathfrak{b}^2}{\mathfrak{a}}\right).$$

In order to construct $\mathfrak{p} = \mathfrak{b}^2/\mathfrak{a}$ we lay off at O on OB the angle γ, at B on BO
the angle OAB; the point of intersection P of the free legs of the angles drawn
gives us $\mathfrak{p}$. We then draw through A the parallel to OP, through P the parallel to
OA and obtain at the point of intersection Q of the two parallels the complex
number $\mathfrak{q} = \mathfrak{a} + (\mathfrak{b}^2/\mathfrak{a})$. Consequently, the end point R of the extension of QO by
itself is the number $\mathfrak{r} = \mathfrak{c}^2/\mathfrak{a}$. From $\mathfrak{c} = \sqrt{\mathfrak{a}\mathfrak{r}}$ it follows that:




1. The magnitude of $\mathfrak{c}$ is the mean proportional of the magnitudes of $\mathfrak{a}$ and $\mathfrak{r}$;

2. the direction of $\mathfrak{c}$ is the direction of the bisector of the angle (2β) enclosed
between OA and OR.

Accordingly, we bisect the angle AOR and mark off on the bisector from O
the mean proportional of OA and OR; the end point of the marked-off segment is
the sought-for point C. Since we can choose the bisector of the concave angle
AOR as well as that of the convex angle (in accordance with the two values of
$\sqrt{\mathfrak{a}\mathfrak{r}}$), there are two possible positions for C.
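The completion of the tripod can also be carried out in complex arithmetic. The sketch below takes two arbitrary example projections, computes the third from the Gauss equation, and then recovers heights above the image plane so that the three legs can be checked for equal length and perpendicularity. The height computation is an addition made for this check and is not part of the construction in the text; it assumes the height of A is not zero.

```python
import cmath
import math

a = complex(3, 0)        # projection OA, taken along the positive real axis
b = complex(1, 2)        # projection OB (arbitrary example value)

r = -(a + b * b / a)     # the number represented by R
c = cmath.sqrt(a * r)    # one position of C; the other is -c

# Heights za, zb of A and B above the image plane: equal leg lengths and
# perpendicular legs require za^2 - zb^2 = |b|^2 - |a|^2 and
# za * zb = -Re(a * conj(b)), so (za + i*zb)^2 is known.
w = cmath.sqrt(complex(abs(b) ** 2 - abs(a) ** 2,
                       -2 * (a * b.conjugate()).real))
za, zb = w.real, w.imag
zc = -(a * c.conjugate()).real / za   # makes OC perpendicular to OA

legs = [(a.real, a.imag, za), (b.real, b.imag, zb), (c.real, c.imag, zc)]
lengths = [math.sqrt(sum(x * x for x in leg)) for leg in legs]
dots = [sum(p * q for p, q in zip(legs[i], legs[j]))
        for i, j in ((0, 1), (0, 2), (1, 2))]

print(abs(a * a + b * b + c * c) < 1e-12)
print(lengths, dots)
```

The three lengths should agree and the three scalar products should vanish up to rounding, and the magnitude of c is the mean proportional of the magnitudes of a and r.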

Note. Weisbach's Theorem. Since the square of a complex number has
an angle twice as great as the number itself, the vectors of the squares of two
complex numbers form with each other an angle that is twice as great as the
vectors of the numbers. Thus the vectors of the squares $\mathfrak{a}^2$, $\mathfrak{b}^2$, $\mathfrak{c}^2$ form the angles
2α, 2β, 2γ with each other. Thus, if we group these vectors (by magnitude and
direction), we obtain (in accordance with the Gauss formula) a triangle with the
external angles 2α, 2β, 2γ. Since the sides of this triangle are $a^2$, $b^2$, $c^2$, the sine
theorem gives us the equation

$$a^2 : b^2 : c^2 = \sin 2\alpha : \sin 2\beta : \sin 2\gamma.$$


This  formula  is 

Weisbach’s  theorem:  The  squares  of  the  normal  projections  of  the  legs  of  a 
tripod  relate  to  each  other  as  the  sine  of  twice  the  angles  enclosed  by  the 
projections. 

Thus,  Weisbach’s  theorem  appears  as  the  direct  consequence  of  the  Gauss 
theorem. 
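Weisbach's proportion can be checked for a sample set of projections that satisfies the Gauss equation. In the sketch below the values of a and b are arbitrary examples, and the angles are taken as directed angles (counterclockwise), which is an assumption about orientation made so that the signs of the sines are consistent.

```python
import cmath
import math

a = complex(3, 0)
b = complex(1, 2)
c = cmath.sqrt(-(a * a + b * b))   # third projection from the Gauss equation

# Directed angles between the projections: alpha = BOC, beta = COA, gamma = AOB.
alpha = cmath.phase(c / b)
beta = cmath.phase(a / c)
gamma = cmath.phase(b / a)

# If a^2 : b^2 : c^2 = sin 2alpha : sin 2beta : sin 2gamma, these three agree.
ratios = [abs(a) ** 2 / math.sin(2 * alpha),
          abs(b) ** 2 / math.sin(2 * beta),
          abs(c) ** 2 / math.sin(2 * gamma)]
print(ratios)
```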

The  Gauss  theorem  can  be  found  unproved  in  the  second  volume  of  Gauss’ 
Werke,  the  Weisbach  theorem  in  Weisbach’s  paper  on  axonometry,  which  was 
published in 1844 at Tübingen in the Polytechnische Mitteilungen of Volz and
Karmarsch. 
Hipparchus’ Stereographic Projection


To  present  a  conformal  map  projection  that  transforms  the  circles  of  the 
globe  into  circles  of  the  map. 


The  projection  we  are  looking  for,  which  is  called  a  stereographic  or  polar 
projection,  is  very  important  in  cartography.  In  all  probability  the  source  of  this 
problem  is  the  astronomer  Hipparchus  (of  Nicaea  in  Bithynia),  one  of  the  most 
amazing  men  of  antiquity,  who  was  making  astronomical  observations  in  the 
period  from  160-125  b.c.  in  Rhodes,  Alexandria,  Syracuse,  and  Babylon. 

The  problem  is  solved  by  the  following  projection  directive: 

One  selects  as  the  projection  plane  or  image  plane  (map  plane)  the  plane  E 
tangent  to  the  globe  at  an  appropriate  point  O — the  so-called  map  center — of  the 
area  to  be  projected,  and  as  the  center  of  a  central  projection  the  end  point  Z  of 
the  globe  diameter  OZ  originating  at  O.  The  stereographic  image  P'  of  an 
arbitrary  point  P  of  the  globe  is  the  point  of  intersection  of  the  projection  ray  ZP 
with  the  image  plane  E. 




The distance r = OP' from the map center is given by the equation

r = 2 tan ζ,

where ζ represents the angle formed by the projection ray ZP with the center ray
ZO, and the radius of the globe is chosen as the unit length.
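The projection directive translates directly into coordinates. The sketch below assumes a unit globe centred at the origin with the map center O = (0, 0, −1) and the projection center Z = (0, 0, 1); this coordinate choice and the function names are illustrative, not from the text.

```python
import math


def stereographic(p):
    """Image of a globe point p = (x, y, z) on the tangent plane z = -1."""
    x, y, z = p
    t = 2.0 / (1.0 - z)          # parameter where the ray from Z meets z = -1
    return (t * x, t * y)


def angle_at_Z(p):
    """Angle between the projection ray ZP and the center ray ZO."""
    ray = (p[0], p[1], p[2] - 1.0)
    # ZO points along (0, 0, -1), so the cosine is -ray_z / |ray|.
    cos_zeta = -ray[2] / math.sqrt(sum(u * u for u in ray))
    return math.acos(cos_zeta)


theta, phi = 1.0, 0.3            # arc OP and longitude of P (example values)
P = (math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     -math.cos(theta))

r = math.hypot(*stereographic(P))
print(math.isclose(r, 2 * math.tan(angle_at_Z(P))))  # True
```

Since ζ is an inscribed angle over the arc OP, it equals half of that arc, so the same distance can be written as r = 2 tan(θ/2) for a point at arc θ from the map center.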

The  stereographic  projection  thus  defined  has  the  following  two  properties: 

I.  Every  image  circle  of  a  globe  circle  is  a  circle. 

II.  The  stereographic  map  is  conformal.  (I.e.,  the  map  image  of  an  angle 
located  on  the  globe  is  an  equally  great  angle.) 

The  proofs  of  these  properties  are  both  based  on  the  following  auxiliary 
theorem: 

The  image  of  a  globe  tangent  bounded  by  globe  and  map  is  just  as  long  as  the 
tangent. 




Proof of the auxiliary theorem. Let P be a point on the globe, P' its
image, M the place at which the globe tangent passing through P and lying in the
drawing plane ZOP meets the image plane, and at the same time (since the two
tangents MO and MP are equal) the midpoint of the hypotenuse of the right
triangle OPP'. The intersection point D of any other globe tangent passing
through P with the image plane will then lie perpendicularly above (below) M.
The image D' of D is D itself, and the image of the tangent DP is thus DP'. Now
the two right triangles at M, DMP and DMP', are congruent (MD = MD and MP
= MP'). Consequently, D'P' = DP, which was to be proved.

Proof of I. We will now prove the somewhat more general Chasles
theorem:* The stereographic image of a globe circle $\mathfrak{k}$ is a circle whose
midpoint is the stereographic projection S' of the apex S of the cone that is
tangent to the globe along the circle $\mathfrak{k}$.

Proof. In Figure 90 let P be an arbitrary point of $\mathfrak{k}$, P' be its image, D the
point of intersection of the tangent to the sphere and cone-generator SP with the
image plane E. According to the auxiliary theorem, DP then equals DP'. Thus, if
H is the point of intersection of the parallel through S' to DP with the projection
ray ZP, it follows from the similarity of the triangle S'P'H to the isosceles
triangle DP'P that the two segments S'P' and S'H are equal. Consequently, in the
relation

S'H : SP = ZS' : ZS

derived from the ray theorem, we can replace S'H with S'P', obtaining

$$S'P' = SP \cdot \frac{ZS'}{ZS}.$$

Now, if P describes the circle $\mathfrak{k}$, SP (as the distance of the apex S of the cone
from $\mathfrak{k}$) remains constant, and consequently, in view of the last equation, S'P'
also remains constant and P' describes a circle in E.

If  the  object  circle  ft  is  a  great  circle  of  the  globe,  the  apex  S  of  the  cone  lies 
at  infinity. 

In  this  case  let  F  be  the  place  at  which  the  perpendicular  from  Z  on  the  plane 


of  ft  touches  the  map  plane  E,  and  let  Fbe  the  place  at  which  the  globe  tangent 
through  P  parallel  to  this  perpendicular  touches  the  map  plane  E.  Since, 
according  to  the  auxiliary  theorem,  VP'  =  VP,  the  triangle  VPP'  is  isosceles;  and 
since  VP  is  parallel  to  FZ,  the  triangle  FZP'  is  also  isosceles;  therefore, 

FP'  =  ZF. 

The  locus  of  the  image  point  P'  is  thus  a  circle  with  the  midpoint  F  and  the 
radius  ZF. 

In  those  great  circles  of  the  globe  that  pass  through  the  projection  center  and 
the  map  center,  the  midpoint  F  of  the  image  circle  recedes  to  infinity.  In  fact, 
these  circles,  as  direct  inspection  will  show,  are  transformed  into  straight  lines  by 
projection. 

Proof of II. Let $\omega$ be an arbitrary angle on the globe, its apex P, therefore, a
point on the globe, and each of its legs a globe tangent. If X and Y are
accordingly the points at which the two tangents intersect the image plane E,
then $\omega = \angle XPY$.

The image $\omega'$ of this angle is the angle XP'Y.

Now, since the triangles XPY and XP'Y are congruent (XY = XY; also,
according to the auxiliary theorem, XP = XP' and YP = YP'), we immediately
obtain

$\omega' = \omega$,

which was to be proved.


FIG.  91. 


Note. If instead of the tangential plane E we choose a plane parallel to it as
our map plane, we obtain a similar stereographic projection, which, naturally,
also possesses the fundamental properties I. and II. Of particular importance is a
picture plane passing through the center of the globe, especially when the north
pole is chosen as the projection center and the equatorial plane is accordingly
chosen as the image plane. In this case we obtain for the distance r of the image
point P' from the map center O lying at the center of the globe the formula

$r = \tan\left(45^\circ + \frac{\varphi}{2}\right)$,

where $\varphi$ is the geographic latitude of the point P. (The above cited angle $\zeta = \angle OZP$
is the base angle of the isosceles triangle OPZ in which the apex angle
situated at O is the complement of the latitude $\varphi$.)
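
The radius formula of the Note can be checked numerically. The following is a minimal sketch, assuming a unit globe centered at the origin with the north pole at (0, 0, 1) and the equatorial plane z = 0 as the map plane; the function names are illustrative and not part of the original text.

```python
import math

def stereographic_from_north_pole(lat_deg, lon_deg):
    """Project a point of the unit globe from the north pole onto the equatorial plane."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # The ray from (0, 0, 1) through (x, y, z) meets the plane z = 0 at parameter 1/(1 - z).
    scale = 1.0 / (1.0 - z)
    return x * scale, y * scale

def radius_formula(lat_deg):
    """r = tan(45 deg + phi/2), the closed form stated in the Note."""
    return math.tan(math.radians(45.0 + lat_deg / 2.0))

if __name__ == "__main__":
    for lat in (-60.0, 0.0, 30.0):
        px, py = stereographic_from_north_pole(lat, 25.0)
        print(lat, math.hypot(px, py), radius_formula(lat))
```

Both columns printed for each latitude should agree, since cos φ / (1 − sin φ) = tan(45° + φ/2).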
76


The Mercator Projection


To  draw  a  conformal  geographic  map  whose  grid  is  composed  of  right-angle 
compartments. 


The Mercator map, which is equally important for both geography and
nautical science, was conceived by Gerhard Kremer, called Mercator
(1512-1594).

On the Mercator map the equator is a segment AB, the length of which agrees
with the length ($2\pi$) of the globe equator. If we divide AB into 360 equal parts
and erect at the dividing points perpendiculars to AB, we thereby obtain the map
meridians. The latitude parallel on the map that corresponds to the globe parallel
of latitude $\varphi$ is a line parallel to AB whose distance $\Phi$ from the map equator is
called the exaggerated latitude. The core of the problem consists of representing
the exaggerated latitude $\Phi$ as a function of the geographic latitude $\varphi$.

In order to solve this problem we will compare the Mercator map with the
Hipparchus map (No. 75), which is also conformal, in which the north pole of
the globe is the projection center and the plane E of the globe equator is the map
plane, and in which, therefore, the globe equator is projected isometrically. Here
also the globe radius will serve as the unit length.

On the Mercator map we divide the distance $\Phi$ of the latitude parallel from
the equator into n equal parts, where n is a very large number; we draw through
the dividing points the latitude parallels 1, 2, 3, ..., n - 1 and call their
corresponding geographic latitudes $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$, so that instead of $\varphi$ we write
$\varphi_n$ also. We then draw the two parallel map meridians $\lambda'$ and $\Lambda'$ corresponding to
the globe meridians $\lambda$ and $\Lambda$, whose difference in longitude measured in radian
measure $\varepsilon = \Lambda - \lambda$ we will make very small. We thereby obtain on the map a
series of successive, very small, congruent rectangles with the base line $\varepsilon$ and the
altitude $\Phi/n$.

We now do the same on the Hipparchus map. Thus, we draw the concentric
map latitudes corresponding to the latitudes $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ and call their radii
$r_1, r_2, \ldots, r_n = r$. According to No. 75,

(1) $r = \tan\left(45^\circ + \frac{\varphi}{2}\right)$.


Similarly, we draw the map meridians $\lambda'$ and $\Lambda'$ corresponding to the two
longitudes $\lambda$ and $\Lambda$; these meridians are at the same time the radii of the circle of
latitude of radius r. Thus, we obtain on the Hipparchus map a series of n
successive, very small compartments, which we can consider as rectangles if n is
sufficiently great. We single out the compartment situated between the latitude
circles of radii $r_\nu$ and $r_{\nu+1}$. Since its base line parallel to the map equator is $r_\nu$
times as great as the base line $\varepsilon$ of the first compartment, and thus also $r_\nu$ times
as great as the base line $\varepsilon$ of the compartment of the Mercator map, then as a
result of the conformal nature of the two maps, the altitude $r_{\nu+1} - r_\nu$ of the
Hipparchus map compartment must also be $r_\nu$ times as great as the altitude $\Phi/n$
of the corresponding compartment of the Mercator map:

$r_{\nu+1} - r_\nu = r_\nu \cdot \frac{\Phi}{n}$.


FIG.  92. 


From this it follows that

$\frac{r_{\nu+1}}{r_\nu} = 1 + \frac{\Phi}{n}$.


If we construct this equation for all n compartments, $r_0$ being equal to 1, and
multiply the resulting n equations together, we obtain

(2) $r = \left(1 + \frac{\Phi}{n}\right)^n$.


However, since for sufficiently great n the right side of this equation does not
deviate noticeably from $e^{\Phi}$ (No. 12), we obtain the equation

(2a) $r = e^{\Phi}$.

From this we get $\Phi = \ln r$ or, because of (1),

(3) $\Phi = \ln \tan\left(45^\circ + \frac{\varphi}{2}\right)$,

and thus the exaggerated latitude $\Phi$ is represented as a function of the
geographic latitude $\varphi$.

As a result of our investigation we obtain the following
Directive for drawing a Mercator map: The map image of a point on the
earth of longitude $\lambda$ and latitude $\varphi$ has a distance $\lambda$ from the zero meridian on
the map and a distance of

$\ln \tan\left(45^\circ + \frac{\varphi}{2}\right)$

from the map equator.

Here the angles $\lambda$ and $\varphi$ are taken as being in radian measure and the radius of
the globe on which the map is based is taken as the unit length.
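
The directive translates directly into a few lines of code. The sketch below assumes a unit globe and angles supplied in degrees for convenience; the helper names `exaggerated_latitude`, `mercator_xy`, and `compound_limit` are illustrative only. The last function evaluates the product (1 + Φ/n)^n so that its approach to tan(45° + φ/2) for large n can be observed.

```python
import math

def exaggerated_latitude(lat_deg):
    """Phi = ln tan(45 deg + phi/2) for a globe of radius 1."""
    return math.log(math.tan(math.radians(45.0 + lat_deg / 2.0)))

def mercator_xy(lon_deg, lat_deg):
    """Map coordinates: distance from the zero meridian and from the map equator."""
    return math.radians(lon_deg), exaggerated_latitude(lat_deg)

def compound_limit(phi_big, n):
    """(1 + Phi/n)^n, which tends to e^Phi as n grows."""
    return (1.0 + phi_big / n) ** n

if __name__ == "__main__":
    phi_big = exaggerated_latitude(60.0)
    for n in (10, 1000, 10**6):
        print(n, compound_limit(phi_big, n), math.tan(math.radians(75.0)))
    print(mercator_xy(139.65, 35.44))
```

As n increases, the middle column approaches the Hipparchus radius tan 75° for latitude 60°, which is the content of the step from (2) to (2a).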


*  The  scalar  product  of  two  vectors  ^  and  ^  is  most  conveniently  written  ^  y  or  in  the  still  simpler 
form  ^  . 

* Michel Chasles (1793-1880), French mathematician, especially well-known for his brilliantly written
Aperçu historique sur l'origine et le développement des méthodes en géométrie.


Nautical  and  Astronomical  Problems 
77


The Problem of the Loxodrome


To determine the length of the loxodromic line joining two points on the
surface of the earth.

A loxodrome is understood to mean a line on the earth's surface that makes
the same angle with all the meridians that it cuts. As long as a ship does not alter
its course it is sailing on a loxodrome. The angle $\kappa$ formed by the loxodrome
with the meridians it cuts is therefore called the azimuth of course. On a
Mercator map (No. 76), which is conformal and possesses rectilinear parallel
meridians, the loxodrome appears as a straight line that cuts the map meridians
at the angle $\kappa$.

In our study of the Mercator map we chose the radius of the globe as the unit
length. Sailors use as the unit length the nautical mile (nm), which is the length
of one minute latitude on a meridian of the earth's surface or, also, the length of
a minute longitude on the equator (each being 1852 meters). Since a meridian is
$\pi$ earth radii long and 180 degrees of latitude is equal to 10800 latitude
minutes, the earth radius is $n = 10800/\pi$ nm long. If we think of a Mercator map
with 1:1 scale (i.e., a map whose equator is as long as the real equator), the
distance between the map circle corresponding to the latitude $\varphi$ and the map
equator, the so-called exaggerated latitude (according to No. 76), is

$\Phi = n \ln \tan\left(45^\circ + \frac{\varphi}{2}\right)$ nm.

The two earth points O and O' whose loxodromic distance d is to be
determined are given by their longitudes $\lambda$, $\lambda'$ and latitudes $\varphi$, $\varphi'$ ($> \varphi$).

The exaggerated latitudes on the map are

$\Phi = n \ln \tan\left(45^\circ + \frac{\varphi}{2}\right)$ and $\Phi' = n \ln \tan\left(45^\circ + \frac{\varphi'}{2}\right)$ nm,

the distances of the map meridians from the zero meridian $\Lambda$ and $\Lambda'$ nm, where
$\Lambda$ represents the number of longitude minutes comprising $\lambda$ and $\Lambda'$ the number
of longitude minutes comprising $\lambda'$.

Let us say that the map meridian through O and the map parallel through O'
intersect at S. Then OS = B is the exaggerated latitude difference $\Phi' - \Phi$, O'S = L
$= \Lambda' - \Lambda$ (nm), OO' is the map loxodrome and $\angle O'OS = \kappa$ is the azimuth of
course.


From the right map triangle OO'S we find the azimuth of course $\kappa$ by means
of the equation

(1) $\tan \kappa = \frac{L}{B}$.


In order to determine the loxodromic distance d of the two positions on the
surface of the earth we divide d into N very small equal segments $\varepsilon$ considered
as being rectilinear. If we draw the meridian through one of two adjacent
division points and the circle of latitude through the other, we obtain thereby a
very small right triangle with the hypotenuse $\varepsilon$, whose meridional leg is the
latitude difference $\beta$ (measured in nm) of the two division points and forms the
angle $\kappa$ with the loxodrome, so that $\beta = \varepsilon \cos \kappa$. Every two adjacent points thus
possess the same latitude difference $\beta$. The total (measured in nm) latitude
difference b of the two positions O and O' on the earth's surface is therefore
$b = N\beta = N\varepsilon \cos \kappa = d \cos \kappa$. Consequently, the sought-for loxodromic distance is

(2) $d = b \sec \kappa$.

Formulas  (1)  and  (2)  contain  the  solution  to  the  problem. 

Example. How great is the loxodromic distance from Valdivia ($\lambda$ = 286°
34.9' E, $\varphi$ = -39° 53.1') to Yokohama ($\lambda'$ = 139° 39.2' E, $\varphi'$ = +35° 26.6')? Here
the longitudinal difference L = 8815.7 minutes; the latitudinal difference b =
4519.7 minutes or nautical miles; the exaggerated latitude difference $B = \Phi' - \Phi$
= 4890 nm; $\kappa$, according to (1), is 60° 58' 50"; and the loxodromic distance d,
according to (2), is 9317 nm.
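
The example can be reproduced with a short computation. This is a sketch under the assumptions of the text (spherical earth of radius 10800/π nm, positions in decimal degrees, tan κ = L/B, d = b sec κ); the function and variable names are chosen here for illustration.

```python
import math

EARTH_RADIUS_NM = 10800.0 / math.pi  # earth radius in nautical miles

def exaggerated_latitude_nm(lat_deg):
    """Exaggerated latitude in nm on a 1:1 Mercator map."""
    return EARTH_RADIUS_NM * math.log(math.tan(math.radians(45.0 + lat_deg / 2.0)))

def loxodrome(lat1, lon1, lat2, lon2):
    """Return (azimuth of course in degrees, loxodromic distance in nm)."""
    big_l = abs(lon2 - lon1) * 60.0   # longitude difference in minutes
    b = abs(lat2 - lat1) * 60.0       # latitude difference in minutes (= nm)
    big_b = abs(exaggerated_latitude_nm(lat2) - exaggerated_latitude_nm(lat1))
    kappa = math.atan2(big_l, big_b)  # tan(kappa) = L / B
    return math.degrees(kappa), b / math.cos(kappa)  # d = b sec(kappa)

if __name__ == "__main__":
    valdivia = (-(39 + 53.1 / 60), 286 + 34.9 / 60)
    yokohama = (35 + 26.6 / 60, 139 + 39.2 / 60)
    kappa_deg, d_nm = loxodrome(*valdivia, *yokohama)
    print(round(kappa_deg, 4), round(d_nm, 1))
```

The sketch does not handle the degenerate course due east or west (b = 0, κ = 90°), where d must be taken along the parallel instead.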

Note. The shortest distance k between the two positions can be found by
applying the cosine theorem to the spherical triangle NVY (North Pole-Valdivia-
Yokohama). In this triangle NV = 90° - $\varphi$ = 129° 53.1', NY = 90° - $\varphi'$, $\angle VNY = \lambda - \lambda'$,
and VY = k.

According to the cosine theorem

$\cos k = \cos NV \cos NY + \sin NV \sin NY \cos(\lambda - \lambda')$

or

$\cos k = \sin\varphi \sin\varphi' + \cos\varphi \cos\varphi' \cos(\lambda - \lambda')$.

This yields

k = 153° 36.1' = 9216.1' = 9216.1 nm.


The  shortest  distance  is  consequently  101  nm  shorter  than  the  loxodromic 
distance. 
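
The great-circle distance of the Note can be checked in the same way. A minimal sketch, assuming a spherical earth on which one minute of arc equals one nautical mile; `great_circle_nm` is an illustrative name.

```python
import math

def great_circle_nm(lat1, lon1, lat2, lon2):
    """Shortest distance in nm from the spherical cosine theorem (angles in degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    cos_k = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    cos_k = max(-1.0, min(1.0, cos_k))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_k)) * 60.0

if __name__ == "__main__":
    k_nm = great_circle_nm(-(39 + 53.1 / 60), 286 + 34.9 / 60,
                           35 + 26.6 / 60, 139 + 39.2 / 60)
    print(round(k_nm, 1))
```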

The  name  loxodrome  stems  from  the  Dutchman  Willebrord  Snell  (Snellius, 
1581-1626).  The  Portuguese  mathematician  Pedro  Nunes  (1492-1577)  was  the 
first  to  recognize  that  the  loxodromic  line  connecting  two  points  of  the  earth’s 
surface  is  not  the  shortest  connecting  line  and  that  a  loxodrome  continuously 
approaches  the  pole  without  ever  reaching  it. 


78 


Determining  the  Position  of  a  Ship  at  Sea 


One of the most important problems in nautical science is that of determining
the position of a ship at sea. The solution is usually obtained by the method of
the so-called astronomical meridian reckoning, which will be analyzed in the
following example.

Problem: On board a ship in the Pacific Ocean in the north latitude on
October 20, 1923 at 6:50 p.m. mean Greenwich time by the chronometer the sun's
altitude was taken in the morning as h = 21° 40.5'; the Nautical Almanac gave
the declination of the sun for the time of observation as $\delta$ = 10° 10.2' S, the
equation of time as e = -15 min 3 sec. The ship then sailed till noon 15.2 nm
WNW, and the altitude of the sun at culmination was then measured as H = 35° 2.7'
and the sun's declination determined at $\Delta$ = 10° 13'.

Where  was  the  ship? 

The  solution  to  this  problem  consists  of  four  steps. 

I. Determination of the meridional latitude $\Phi$. At culmination the
successive arcs (the altitude of the sun, the pole distance, the pole altitude)
cover the meridional half circle above the horizon in such manner that
$H + (90^\circ + \Delta) + \Phi = 180^\circ$. This gives us

$\Phi = 90^\circ - H - \Delta = 44^\circ 44.3'$.


II. Determination of the latitude difference $\beta$ and the longitude
difference l of the two observation points, as well as the a.m. latitude $\varphi$.

If one imagines two sufficiently close points A and B on the earth's surface,
the distance between which is d nm and the line connecting which forms the
angle $\kappa$ with the longitudinal circle passing through the center M of AB, then the
latitudinal difference of the two points is $d \cos \kappa$ nm, the longitudinal difference
$d \sin \kappa$ nm. Since one nautical mile of latitudinal difference is equivalent to one
minute latitude difference and one nautical mile longitudinal difference at the
latitude $\varphi$ corresponds to $\sec \varphi$ minutes longitudinal difference, then the
latitudinal and longitudinal differences of A and B in minutes are:

$\beta = d \cos \kappa$, $l = d \sin \kappa \sec \mu$,

where $\mu$ is the latitude of M, the so-called mean latitude of A and B.

In our example (d = 15.2, $\kappa$ = 67.5°) we find first that

$\beta = 5.8'$.

From this it follows that the a.m. latitude is

$\varphi = \Phi - \beta = 44^\circ 38.5'$,

and the mean latitude is

$\mu = \frac{\varphi + \Phi}{2} = 44^\circ 41.4'$.


FIG.  93. 

Accordingly we find the longitude difference to be

$l = 19.75'$.

III. Determination of the a.m. longitude $\lambda$.

In the formula (see Figure 93) corresponding to the nautical triangle PZO
(pole-zenith-sun) of the a.m. observation

$\cos z = \cos p \cos b + \sin p \sin b \cos ZPO$,

we replace z, p, b, and ZPO with 90° - h, 90° + $\delta$, 90° - $\varphi$, and 180° - T (T
being understood to represent the time angle of the sun), and we obtain

$-\cos T = \tan\delta \tan\varphi + \frac{\sin h}{\cos\delta \cos\varphi}$.

This yields the true local time T of the a.m. observation

T.L.T. = T = 134° 47.5' = 8 hr 59 min 10 sec.

From this and the time equation e we obtain the mean local time of the
observation

M.L.T. = T.L.T. + e = 8 hr 44 min 7 sec.

If we reduce the mean Greenwich time of the observation by the mean local
time, we obtain the western longitude $\lambda$ of the observation point in time:

$\lambda$ = M.G.T. - M.L.T. = 10 hr 5 min 53 sec.

In angular measure (1 hr time longitude = 15 degrees longitude), this comes to

$\lambda$ = 151° 28.25' W.

IV. Determination of the meridian longitude $\Lambda$.

$\Lambda = \lambda + l$ = 151° 48'.

Result: a.m. Position: 44° 38.5' N, 151° 28.25' W, Noon Position: 44° 44.3'
N, 151° 48' W.
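
The four steps can be followed in code. The sketch below hard-codes the data of the example and assumes, as in the text, that the time angle T is counted from lower culmination (so that T/15 is the true local time in hours); all names are illustrative.

```python
import math

def dm(deg, minutes=0.0):
    """Degrees and minutes to decimal degrees."""
    return deg + minutes / 60.0

def ship_position():
    """Return (a.m. latitude, a.m. west longitude, noon latitude, noon west longitude) in degrees."""
    h, delta = dm(21, 40.5), dm(10, 10.2)        # a.m. altitude, southern declination
    big_h, big_delta = dm(35, 2.7), dm(10, 13)   # noon altitude and declination
    eq_time_hr = -(15 + 3 / 60) / 60             # equation of time, hours
    mgt_hr = 18 + 50 / 60                        # mean Greenwich time, hours
    d_nm, kappa = 15.2, math.radians(67.5)       # run WNW until noon

    noon_lat = 90.0 - big_h - big_delta                          # step I
    beta = d_nm * math.cos(kappa) / 60.0                         # step II, degrees
    am_lat = noon_lat - beta
    mean_lat = (am_lat + noon_lat) / 2.0
    l_deg = d_nm * math.sin(kappa) / math.cos(math.radians(mean_lat)) / 60.0

    dl, ph, hh = math.radians(delta), math.radians(am_lat), math.radians(h)
    cos_t = -(math.tan(dl) * math.tan(ph) + math.sin(hh) / (math.cos(dl) * math.cos(ph)))
    tlt_hr = math.degrees(math.acos(cos_t)) / 15.0               # step III
    mlt_hr = tlt_hr + eq_time_hr
    am_lon = (mgt_hr - mlt_hr) * 15.0
    noon_lon = am_lon + l_deg                                    # step IV
    return am_lat, am_lon, noon_lat, noon_lon

if __name__ == "__main__":
    print([round(v, 4) for v in ship_position()])
```

The printed values are decimal degrees; converted to degrees and minutes they should agree with the stated result to within the rounding used in the text.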


79 


Gauss’  Two-Altitude  Problem 


From  the  altitudes  of  two  known  stars  determine  the  time  and  position. 


This  problem,  which  is  very  important  for  astronomers,  geographers,  and 
mariners,  was  solved  by  Gauss  in  1812  in  Bode’s  Astronomisches  Jahrbuch. 

Two stars are said to be known when their equatorial coordinates (the right
ascension and declination) are known. Let these coordinates of the two stars S
and S' be $\alpha|\delta$ and $\alpha'|\delta'$. In the present problem all we need in addition is the right
ascension difference $\alpha' - \alpha$. In the figure let P be the world pole; thus PS = p =
90° - $\delta$ will be the pole distance from S; PS' = p' = 90° - $\delta'$ will be the pole
distance from S'; and $\angle SPS' = \tau$ will be the angle between the hour circles of the
two stars, as well as the magnitude of the right ascension difference; let Z be the
zenith of the observation point, so that PZ = b = 90° - $\varphi$ is the complement of the
latitude $\varphi$, ZS = z the zenith distance from S, and ZS' = z' the zenith distance from
S', the last two being as well the complements of the altitudes h and h',
respectively.

We still need the auxiliary magnitudes $\angle PSS' = \sigma$, $\angle PS'S = \sigma'$, $\angle PSZ = \psi$,
$\angle ZSS' = \zeta$, $\angle ZPS = t$, and the side SS' = s.


The  computation,  which  is  very  simple,  consists  of  three  steps  corresponding 
to  the  three  triangles  PSS',  ZSS',  PZS,  which  are  taken  up  in  that  order. 

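I. Triangle PSS'. The angles $\sigma$ and $\sigma'$ are determined according to Napier's
formulas

$\tan\frac{\sigma + \sigma'}{2} = \frac{\cos\frac{p' - p}{2}}{\cos\frac{p' + p}{2}} \cot\frac{\tau}{2}$, $\tan\frac{\sigma - \sigma'}{2} = \frac{\sin\frac{p' - p}{2}}{\sin\frac{p' + p}{2}} \cot\frac{\tau}{2}$,

and the side s is determined according to the sine formula

$\sin s : \sin p = \sin\tau : \sin\sigma'$.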


II. Triangle ZSS'. The angle $\zeta$ is calculated according to the tangent theorem
for the half angle:

$\tan\frac{\zeta}{2} = \sqrt{\frac{\sin(\Sigma - z) \sin(\Sigma - s)}{\sin\Sigma \sin(\Sigma - z')}}$,

where $\Sigma$ is half the sum of the triangle sides z, z', s. In connection with this we
determine $\psi = \sigma - \zeta$.

III. Triangle PZS, determination of the locale and the time.

The sought-for latitude can be obtained from

$\cos b = \cos p \cos z + \sin p \sin z \cos\psi$

or

$\sin\varphi = \sin\delta \sin h + \cos\delta \cos h \cos\psi$.
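
The latitude determination through the three triangles can be sketched numerically. Note the substitutions: for brevity the sketch uses the spherical cosine theorem in place of Napier's formulas and the half-angle tangent theorem, which yields the same angles σ and ζ. Because the relative position of Z and SS' is fixed by the figure and not by the formulas, the sketch returns the latitude for both ψ = σ − ζ and ψ = σ + ζ. All names are illustrative.

```python
import math

def _clamp(x):
    """Keep rounding noise from pushing a cosine or sine outside [-1, 1]."""
    return max(-1.0, min(1.0, x))

def latitude_candidates(delta, delta2, tau, h, h2):
    """Latitudes (radians) for psi = sigma - zeta and psi = sigma + zeta.

    delta, delta2: declinations of S and S'; tau: angle SPS'; h, h2: altitudes.
    """
    p, p2 = math.pi / 2 - delta, math.pi / 2 - delta2
    z, z2 = math.pi / 2 - h, math.pi / 2 - h2
    # Triangle PSS': side s opposite tau, angle sigma at S (opposite p').
    s = math.acos(_clamp(math.cos(p) * math.cos(p2)
                         + math.sin(p) * math.sin(p2) * math.cos(tau)))
    sigma = math.acos(_clamp((math.cos(p2) - math.cos(p) * math.cos(s))
                             / (math.sin(p) * math.sin(s))))
    # Triangle ZSS': angle zeta at S (opposite z').
    zeta = math.acos(_clamp((math.cos(z2) - math.cos(z) * math.cos(s))
                            / (math.sin(z) * math.sin(s))))
    # Triangle PZS: sin(phi) = sin(delta) sin(h) + cos(delta) cos(h) cos(psi).
    return tuple(
        math.asin(_clamp(math.sin(delta) * math.sin(h)
                         + math.cos(delta) * math.cos(h) * math.cos(psi)))
        for psi in (sigma - zeta, sigma + zeta)
    )

if __name__ == "__main__":
    phi = math.radians(50.0)
    d1, t1 = math.radians(20.0), math.radians(30.0)    # star S, hour angle from the meridian
    d2, t2 = math.radians(40.0), math.radians(-45.0)   # star S'
    alt = lambda d, t: math.asin(math.sin(d) * math.sin(phi)
                                 + math.cos(d) * math.cos(phi) * math.cos(t))
    cands = latitude_candidates(d1, d2, abs(t1 - t2), alt(d1, t1), alt(d2, t2))
    print([round(math.degrees(c), 6) for c in cands])
```

In the usage example the altitudes are synthesized for an observer at latitude 50°, so one of the two printed candidates should reproduce that latitude.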

The sought-for time angle T, i.e., the angle at the pole that has been described
by the hour circle of the star S since its lower culmination, follows from

$\cos t = \frac{\cos z - \cos p \cos b}{\sin p \sin b} = \frac{\sin h - \sin\delta \sin\varphi}{\cos\delta \cos\varphi}$

and

T = 12 hr ± t,

where the upper sign applies when the star S at the moment of observation is in
the western celestial hemisphere and the lower when it is in the eastern celestial
hemisphere. From this we obtain directly the sought-for time of the observation,
the sidereal time $\theta$ (the time angle of the Aries point), when we add the right
ascension $\alpha$ to the time angle T: $\theta = T + \alpha$.

In order to obtain the mean local time (M.L.T.) of the observation we first
determine, with an approximate value of the right ascension of the mean sun
for the moment of the observation, the approximate mean local time of the
observation (the sidereal time $\theta$ less this approximate right ascension); then,
using this already fairly exact mean local time, we determine the exact right
ascension $\alpha_0$ of the mean sun for the moment of observation and finally the
exact mean local time

M.L.T. = $\theta - \alpha_0$.


We  can  apply  this  solution  of  the  Gauss  two-altitude  problem  directly  to  the 
solution  of  the  very  important  navigational  problem, 

Douwes'* problem: From two altitudes of a star (the sun) with known
declination and the interval between the two observations determine the latitude
of the place of observation.

We need only consider S and S', respectively, as the place, $\delta$ and $\delta'$,
respectively, as the declination of the star at the first and second observations.
For fixed stars $\delta = \delta'$, while for the sun and the planets $\delta'$ differs somewhat from
$\delta$. ($\tau$ is the angle determined by the known time interval between the hour circles
of the star corresponding to the two moments of observation.)

Since the two measured altitudes are usually observed at different places A
and B, while the above calculation is related to only one place, let us say B, the
altitude measured at A must be "reduced to place B." For this purpose we solve
the problem:

At a place A the altitude of a star is observed at a given time $\theta$; at the same
moment in time what is the altitude of the star at place B?

To begin with, it is clear that all places on the earth's surface at which the star
has the same altitude or the same zenith distance at moment $\theta$ lie on a circle of
the geosphere the spherical midpoint of which is the end point $S_0$ of the earth
radius from the geocenter to the star. This circle is called the equal altitude circle
of the star, its midpoint $S_0$ the star image.


In Figure 95 let $\mathfrak{A}$ and $\mathfrak{B}$ be the two equal altitude circles of the star at
moment $\theta$ on which the observation points A and B lie; let $S_0$ be the star image,
O the point of intersection of the great arc $S_0A$ with $\mathfrak{B}$. We will assume that the
distance AB is so small that the triangle AOB can be considered plane. This gives
for the difference between the zenith distances and, consequently, also for the
difference in the altitudes of the star at A and B

$AO = AB \cos\omega$,

where $\omega$ is the angle between the ship's course AB and the bearing AO of the star
at A.

We accordingly obtain the sought-for star altitude h at B at the time $\theta$ of the
observation made at A if we increase or reduce the star altitude measured at A by
the product of the traversed distance AB and the cosine of the angle between the
course and the bearing of the star at A, accordingly as the ship draws nearer to or
recedes from the star.

The "reduced" altitude thus obtained must then be substituted for h in the
above Gauss equation, while the altitude measured at B must be used for h'.

The value for $\varphi$ obtained by this calculation is naturally the latitude of the
second observation point B.
80


Gauss' Three-Altitude Problem


From  the  time  intervals  between  the  moments  at  which  three  known  stars 
attain  the  same  altitude,  determine  the  moments  of  the  observations,  the  latitude 
of  the  observation  point  and  the  altitude  of  the  stars. 


The  significance  of  this  Gauss  method  for  determining  time  and  location 
resides  in  the  fact  that  it  eliminates  all  observational  error  resulting  from 
atmospheric  refraction. 

Solution. We designate the equatorial coordinates (right ascension and
declination) of the three stars as $\alpha|\delta$, $\alpha'|\delta'$, $\alpha''|\delta''$, the latitude of the observation
point as $\varphi$, the moments of the observations as t, t', t'', the time angles of the three
stars at these moments as T, T', T'', so that the differences $T' - T = t' - t$ and
$T'' - T = t'' - t$ are known. This gives us the three equations

(1) $\sin h = \sin\delta \sin\varphi - \cos\delta \cos\varphi \cos T$,

(2) $\sin h = \sin\delta' \sin\varphi - \cos\delta' \cos\varphi \cos T'$,

(3) $\sin h = \sin\delta'' \sin\varphi - \cos\delta'' \cos\varphi \cos T''$.

By subtracting the two first equations we obtain

(4) $\sin\varphi(\sin\delta - \sin\delta') = \cos\varphi(\cos\delta \cos T - \cos\delta' \cos T')$.


We now introduce the half sum and half difference

$s = \frac{\delta' + \delta}{2}$ and $u = \frac{\delta' - \delta}{2}$

and

$S = \frac{T' + T}{2}$ and $U = \frac{T' - T}{2}$

of the declinations $\delta'$ and $\delta$ and the time angles T' and T, respectively, and
accordingly replace $\delta'$ and $\delta$ in (4) by s + u and s - u, and replace T' and T by S +
U and S - U. In the transformed equation (4) we then apply the addition theorem
throughout and obtain

$-\sin\varphi \cos s \sin u = \cos\varphi(\sin S \sin U \cos s \cos u + \cos S \cos U \sin s \sin u)$.

Here we divide by $\cos\varphi \cos s \sin u$ and obtain

$-\tan\varphi = \sin S \cdot \sin U \cot u + \cos S \cdot \cos U \tan s$.

Since U, u, and s are known, we determine the auxiliary magnitudes r and w
such that

$r \cos w = \sin U \cot u$ and $r \sin w = \cos U \tan s$.

(First w is determined from $\tan w = \tan s \tan u \cot U$ and then r from one of the
two auxiliary equations.) The equation obtained then assumes the simple form

(I) $-\tan\varphi = r \sin(S + w)$.

In precisely the same way, by subtracting the two equations (1) and (3),
introducing the half sums

$\mathfrak{s} = \frac{\delta'' + \delta}{2}$, $\mathfrak{S} = \frac{T'' + T}{2}$

and half differences

$\mathfrak{u} = \frac{\delta'' - \delta}{2}$, $\mathfrak{U} = \frac{T'' - T}{2}$,

and introducing the auxiliary magnitudes $\mathfrak{r}$ and $\mathfrak{w}$ determined by the conditions

$\mathfrak{r} \cos\mathfrak{w} = \sin\mathfrak{U} \cot\mathfrak{u}$,

$\mathfrak{r} \sin\mathfrak{w} = \cos\mathfrak{U} \tan\mathfrak{s}$,

we find the equation

(II) $-\tan\varphi = \mathfrak{r} \sin(\mathfrak{S} + \mathfrak{w})$.

By division of II and I we obtain the sine ratio of the two unknown angles
$(\mathfrak{S} + \mathfrak{w})$ and $(S + w)$:

(III) $\frac{\sin(\mathfrak{S} + \mathfrak{w})}{\sin(S + w)} = \frac{r}{\mathfrak{r}}$.

However, since the difference

$(\mathfrak{S} + \mathfrak{w}) - (S + w) = \frac{T'' - T'}{2} + \mathfrak{w} - w$

of these angles is known, it is easy to calculate the sum of the angles by applying
the sine tangent theorem (No. 40) to (III). From the sum and the difference we
obtain directly the angles $\mathfrak{S} + \mathfrak{w}$ and $S + w$ themselves and consequently also
the unknown angles

$\mathfrak{S} = \frac{T'' + T}{2}$ and $S = \frac{T' + T}{2}$.

From S and the known difference T' - T we then obtain the sought-for time
angles T and T'; from $\mathfrak{S}$ and the known difference T'' - T we obtain in similar
fashion the time angles T and T''. By adding the right ascension to the time angle
we finally obtain the moments of the observations in sidereal time.

The sought-for latitude then follows from (I) or (II), the sought-for altitude h
from (1), (2), or (3).
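
The reduction to equations (I), (II), and (III) can be traced numerically. The following is a sketch under stated assumptions: all angles are in radians, the three declinations are distinct, and the sine tangent theorem is used in the form tan((x + y)/2) = tan((x − y)/2) · (k + 1)/(k − 1) for sin x : sin y = k. The equations leave a mirror solution (φ → −φ, T → T + 180°, h → −h); the sketch selects the one with positive altitude. The function name and argument layout are illustrative.

```python
import math

def three_altitude(dec, dec1, dec2, d_t1, d_t2):
    """Return (latitude, time angle T of the first star, common altitude h).

    dec, dec1, dec2: declinations; d_t1 = T' - T and d_t2 = T'' - T.
    """
    def auxiliary(d_a, d_b, d_t):
        s, u, big_u = (d_b + d_a) / 2, (d_b - d_a) / 2, d_t / 2
        r_cos = math.sin(big_u) / math.tan(u)   # r cos w = sin U cot u
        r_sin = math.cos(big_u) * math.tan(s)   # r sin w = cos U tan s
        return math.hypot(r_cos, r_sin), math.atan2(r_sin, r_cos)

    r, w = auxiliary(dec, dec1, d_t1)       # belongs to equation (I)
    r2, w2 = auxiliary(dec, dec2, d_t2)     # belongs to equation (II)
    k = r / r2                              # (III): sin x / sin y, x = S2 + w2, y = S + w
    diff = (d_t2 - d_t1) / 2 + w2 - w       # x - y
    half_sum = math.atan2(math.tan(diff / 2) * (k + 1), k - 1)
    y = half_sum - diff / 2                 # S + w, determined up to 180 degrees
    lat = math.atan(-r * math.sin(y))       # (I): -tan(phi) = r sin(S + w)
    t = (y - w) - d_t1 / 2                  # T = S - U
    alt = math.asin(math.sin(dec) * math.sin(lat)
                    - math.cos(dec) * math.cos(lat) * math.cos(t))
    if alt < 0:                             # switch to the mirror solution
        lat, t, alt = -lat, t + math.pi, -alt
    return lat, t % (2 * math.pi), alt

if __name__ == "__main__":
    phi, h = math.radians(50.0), math.radians(30.0)
    decs = [math.radians(v) for v in (10.0, 25.0, -5.0)]
    base = [math.acos((math.sin(d) * math.sin(phi) - math.sin(h))
                      / (math.cos(d) * math.cos(phi))) for d in decs]
    t0, t1, t2 = base[0], 2 * math.pi - base[1], base[2]
    lat, t_out, alt = three_altitude(decs[0], decs[1], decs[2], t1 - t0, t2 - t0)
    print(math.degrees(lat), math.degrees(t_out), math.degrees(alt))
```

In the usage example the time angles are synthesized from a known latitude of 50° and altitude of 30° using equation (1), so the output should return those values.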

Note. If the latitude is to be determined from two observations of the same
star altitude and the time interval between them, we have at our disposal only
equations (1) and (2) and must assume that the time angle T for one of the
observations is known. Equation (I), all the magnitudes on the right side of
which are known, then gives $\varphi$.

A remarkable special case of this situation is the

Problem of Riccioli: From the time between the culminations of two known
stars that rise or set at the same time, find the latitude of the observation point.

This problem posed by Riccioli in 1651 is especially noteworthy in that the
method employed makes possible determinations of latitude without an
angle-measuring instrument.

If T and T' are the time angles of star risings, their difference 2U = T' - T is
also the time between their culminations. Our initial equations (1) and (2) are
simplified here (because h = 0) to

$\cos T = \tan\delta \tan\varphi$ and $\cos T' = \tan\delta' \tan\varphi$.

We introduce the complements $\tau$ and $\tau'$ of the time angles and obtain

$\sin\tau = \tan\delta \tan\varphi$, $\sin\tau' = \tan\delta' \tan\varphi$,

and from this by division we get the sine ratio of the angles $\tau$ and $\tau'$:

$\sin\tau : \sin\tau' = \tan\delta : \tan\delta'$.

Since $\tau - \tau' = T' - T$ is known, we obtain $\tau + \tau'$ from this equation, in
accordance with the sine-tangent theorem. We then get $2\tau = (\tau + \tau') + (\tau - \tau')$
and finally $\varphi$ from $\sin\tau = \tan\delta \tan\varphi$.
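
Riccioli's problem makes a compact numerical exercise. A minimal sketch, assuming angles in radians, distinct nonzero declinations, and the sine-tangent theorem in the form tan((τ + τ')/2) = tan((τ − τ')/2) · (k + 1)/(k − 1) with k = tan δ : tan δ'. The half sum is only determined up to 180°, so the sign of the returned latitude may need to be fixed from other knowledge; the function name is illustrative.

```python
import math

def riccioli_latitude(delta, delta2, culmination_gap):
    """Latitude from the declinations and the interval T' - T between culminations."""
    k = math.tan(delta) / math.tan(delta2)   # sin(tau) : sin(tau')
    diff = culmination_gap                   # tau - tau' = T' - T
    half_sum = math.atan2(math.tan(diff / 2) * (k + 1), k - 1)
    tau = half_sum + diff / 2                # 2 tau = (tau + tau') + (tau - tau')
    return math.atan(math.sin(tau) / math.tan(delta))  # sin(tau) = tan(delta) tan(phi)

if __name__ == "__main__":
    phi = math.radians(50.0)
    d1, d2 = math.radians(20.0), math.radians(10.0)
    t_rise1 = math.acos(math.tan(d1) * math.tan(phi))  # cos T = tan(delta) tan(phi)
    t_rise2 = math.acos(math.tan(d2) * math.tan(phi))
    print(math.degrees(riccioli_latitude(d1, d2, t_rise2 - t_rise1)))
```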
81


The Kepler Equation


From  the  mean  anomaly  of  a  planet  calculate  the  eccentric  and  true  anomaly. 

Johannes Kepler (1571-1630) was one of the greatest astronomers of all
time. The famous problem named after him is to be found in the 60th chapter of
Kepler's major work Astronomia nova, published in Prague in 1609, a book that,
according to Lalande, every astronomer must read at least once.

Before  taking  up  the  solution  we  will  present  a  short  explanation  of  the  three 
anomalies. 

Let S and P be the midpoints of the sun and a planet, respectively, let N be the
point of the planet's orbit at which the planet is nearest to the sun, the so-called
perihelion, let O be the midpoint of the elliptical orbit and of its circle of
circumscription, $P_0$ the point of intersection of the circle of circumscription with
the parallel drawn through P to the minor orbit axis, a and b the semimajor and
semiminor axes of the ellipse, respectively, OS = e the linear eccentricity,
$\varepsilon = e/a$ the astronomic eccentricity or form number, T the period of revolution
of the planet, and t the time elapsed at the planet's position P since its passage
through the perihelion.


FIG.  96. 


The true anomaly W is the angle NSP, i.e., the angle described by the focal
radius of the planet in the time t, the mean anomaly M the angle that the focal
radius would describe in the time t if it were to revolve uniformly (with the same
period of revolution T), so that in angular measure

$M = \frac{2\pi}{T} t$.

Finally, the eccentric anomaly E is the angle $NOP_0$ formed by the radius of
the circle of circumscription to $P_0$ with the radius of the circle of circumscription
ON.

With E as a variable parameter we have

$x = a \cos E$, $y = b \sin E$ the equation of the orbit,

$x = a \cos E$, $y_0 = a \sin E$ the equation of its circle of circumscription.

There exists between the eccentric and true anomaly the relation (obtainable
from the right triangle with the legs x - e and y)

$\tan W = \frac{b \sin E}{a \cos E - e}$;

after squaring and use of the formulas $b^2 = a^2 - e^2$, $e = a\varepsilon$, and $\cos^2 E + \sin^2 E = 1$,
$\sec^2 W - \tan^2 W = 1$, this relation is transformed into

$\cos W = \frac{\cos E - \varepsilon}{1 - \varepsilon \cos E}$.


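In order to obtain, in addition, a formula that is convenient for logarithmic
treatment, Gauss introduced the half angles $\frac{1}{2}W$ and $\frac{1}{2}E$ and made use of the
formulas

$1 + \cos\varphi = 2\cos^2\frac{\varphi}{2}$ and $1 - \cos\varphi = 2\sin^2\frac{\varphi}{2}$.

We write the above equation

$\frac{1 - \cos W}{1 + \cos W} = \frac{1 + \varepsilon}{1 - \varepsilon} \cdot \frac{1 - \cos E}{1 + \cos E}$

and obtain the

Gauss formula: $\tan\frac{W}{2} = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}} \tan\frac{E}{2}$.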


There  exists  between  the  eccentric  and  mean  anomaly  (in  radian  measure)  the 
famous  Kepler  equation: 


$$E - \varepsilon\sin E = M.$$

This equation is a consequence of the formula

$$J = \frac{ab}{2}\,(E - \varepsilon\sin E)$$

for the area J of the elliptical sector SNP and of the Kepler surface theorem:
“The focal radius of a planet sweeps equal surfaces in equal times.” [According
to the area formula, the area of the half ellipse ($E = \pi$) is $\tfrac12\pi ab$; the area of the
whole ellipse is thus $\pi ab$. According to Kepler’s surface theorem, there exists the
proportion $J : \pi ab = t : T$. Consequently, $E - \varepsilon\sin E = 2\pi t : T = M$.]

The  crux  of  the  Kepler  problem  now  consists  of  the  solution  of  the  Kepler 
equation 


$$E - \varepsilon\sin E = M$$

for the unknown E (when M and $\varepsilon$ are assumed to be known).

The following determination of E rests upon the assumption that the form
number $\varepsilon$ is a proper fraction and consists in the calculation of a series $E_1$, $E_2$,
$E_3$, ... of approximate values for the eccentric anomaly that deviate progressively
less and less from the true value E as the index number increases and
approximate the true value sufficiently closely at a relatively low index number.
For the first approximation value we choose

$$E_1 = M + \varepsilon\sin M.$$

Its deviation from the true value E is

$$E - E_1 = \varepsilon(\sin E - \sin M).$$

However, since

$$|\sin E - \sin M| < |E - M| = |\varepsilon\sin E| < \varepsilon,$$

it follows that

$$|E - E_1| < \varepsilon^2.$$

As the second approximation value we choose

$$E_2 = M + \varepsilon\sin E_1.$$

Its deviation from E is $E - E_2 = \varepsilon(\sin E - \sin E_1)$. However, since

$$|\sin E - \sin E_1| < |E - E_1|$$

and the latter magnitude, as was just shown, is $< \varepsilon^2$, it follows that

$$|E - E_2| < \varepsilon^3.$$

The third approximation value is

$$E_3 = M + \varepsilon\sin E_2.$$

Its deviation from E, absolutely considered, is $< \varepsilon^4$, etc.

The nth approximation value deviates from the true value by less than the
(n + 1)th power of the form number $\varepsilon$. The approximation values accordingly
approach the true value progressively more rapidly as $\varepsilon$ diminishes.

In the earth’s orbit, for example, $\varepsilon$ = 0.01674, $\varepsilon^3$ = 0.00000469, arc 1″ =
0.00000485. Consequently:

For the earth’s orbit the second approximation value is already exact to
seconds!

In the orbit of Mars, which has the fairly high form number of 0.0933, $\varepsilon^5$ =
0.0000071, so that the fourth approximation value $E_4$ results in an error of less
than 2″.

After  E  is  determined  the  true  anomaly  is  calculated  by  the  Gauss  formula. 
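
The successive approximations and the Gauss formula can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up here, angles are taken in radians, and the sample values (M = 329° 16′ 1″ and the earth's form number 0.01674) are the ones used in the equation-of-time example further on.

```python
import math

def kepler_approximations(M, eps, steps=4):
    """Return [E_1, ..., E_steps] with E_1 = M + eps*sin(M), E_{n+1} = M + eps*sin(E_n).

    M is the mean anomaly in radians; eps is the form number, assumed 0 <= eps < 1.
    """
    values = []
    E = M  # starting value, so the first pass yields E_1 = M + eps*sin(M)
    for _ in range(steps):
        E = M + eps * math.sin(E)
        values.append(E)
    return values

def true_anomaly(E, eps):
    """Gauss formula tan(W/2) = sqrt((1+eps)/(1-eps)) * tan(E/2); W in [0, 2*pi)."""
    factor = math.sqrt((1.0 + eps) / (1.0 - eps))
    # atan2 keeps W/2 in the same half-turn as E/2, which a bare atan would lose.
    half_W = math.atan2(factor * math.sin(E / 2.0), math.cos(E / 2.0))
    return (2.0 * half_W) % (2.0 * math.pi)

def from_dms(degrees, minutes=0.0, seconds=0.0):
    """Convert degrees, arc minutes, and arc seconds to radians."""
    return math.radians(degrees + minutes / 60.0 + seconds / 3600.0)

if __name__ == "__main__":
    M = from_dms(329, 16, 1)   # mean anomaly of the later example
    eps = 0.01674              # form number of the earth's orbit
    for n, E in enumerate(kepler_approximations(M, eps), start=1):
        print(f"E_{n} = {math.degrees(E):.5f} deg")
    print(f"W   = {math.degrees(true_anomaly(E, eps)):.5f} deg")
```

Because each pass gains roughly one further power of the form number, a fixed small number of passes is enough for planetary orbits; for strongly eccentric orbits more passes would be needed.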

Note.  Kepler’s  problem  is  of  the  greatest  importance  for  astronomy.  It 
forms  the  basis,  for  example,  for  the  determination  of  the  equation  of  time  for  a 
given  moment  of  time. 

[The equation of time is conventionally understood to be the difference
between mean and true local time or also the difference between the right
ascensions $\alpha$ and $\alpha_0$ of the true and mean sun:

$e = \text{M.L.T.} - \text{T.L.T.} = \alpha - \alpha_0$.]

The  calculation  is  based  on  the  following  seven  steps: 

1. Determination of the right ascension $\alpha_0$ of the mean sun for the given
moment of time from its daily increase of 3 min 56.55536 sec and its value for a
fixed moment of time (on January 1, 1925, at midnight M.G.T., it was $\alpha_0$ = 18 hr
40 min 30 sec).

2. Calculation of the mean anomaly M according to the (definition) equation
$\alpha_0 = M + \Pi$, where $\Pi$ is the longitude of the true sun at perigee. ($\Pi$ on January 1,
1925, was 281° 39′ 2″ and it increases annually by 1′ 1.9″.)

3. Determination of the eccentric anomaly E from Kepler’s equation
$E - \varepsilon\sin E = M$ with $\varepsilon$ = 0.01674.

4. Calculation of the true anomaly W from the Gauss formula

$$\tan\tfrac12 W = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}}\,\tan\tfrac12 E.$$

5. Determination of the longitude L of the true sun according to the equation
$L = W + \Pi$.

6. Determination of the right ascension $\alpha$ of the true sun in accordance with
the equation $\tan\alpha = \cos i\,\tan L$ obtained from the astronomical triangle having
the hypotenuse L and the legs $\alpha$ and $\delta$; in the equation, i represents the
inclination of the ecliptic.

7. Calculation of the equation of time e from $e = \alpha - \alpha_0$.

Example.  The  equation  of  time  for  the  2nd  of  December,  1925  at  4:00  p.m. 
Central  European  Time. 


$\alpha_0$ = 16 hr 43 min 44 sec = 250° 56′, M = 329° 16′ 1″,
$E_1$ = 328° 46′ 38″, $E_2$ = E = 328° 46′ 12″, W = 328° 16′ 10″,
L = 249° 56′ 9″, $\alpha$ = 248° 17′ 28″ = 16 hr 33 min 10 sec,
e = −10 min 34 sec.
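
Steps 5 to 7 can be checked against these example values with a short Python sketch. Two inputs are assumptions not stated in the text: the obliquity of the ecliptic is taken as roughly 23.45°, and the perigee longitude for December 1925 is taken as about 281° 40′ (the January value advanced by most of the annual increase). The function name is illustrative only.

```python
import math

def equation_of_time(W_deg, Pi_deg, alpha0_deg, obliquity_deg=23.45):
    """Steps 5-7: return (L, alpha, e); L and alpha in degrees, e in minutes of time.

    obliquity_deg is an assumed round value for the inclination i of the ecliptic.
    """
    L = (W_deg + Pi_deg) % 360.0                      # step 5: L = W + Pi
    L_rad = math.radians(L)
    cos_i = math.cos(math.radians(obliquity_deg))
    # Step 6: tan(alpha) = cos(i) * tan(L); atan2 keeps alpha in the quadrant of L.
    alpha = math.degrees(math.atan2(cos_i * math.sin(L_rad), math.cos(L_rad))) % 360.0
    # Step 7: e = alpha - alpha0, wrapped to [-180, 180) degrees; 1 degree = 4 min.
    diff = (alpha - alpha0_deg + 180.0) % 360.0 - 180.0
    return L, alpha, 4.0 * diff

if __name__ == "__main__":
    W = 328 + 16 / 60 + 10 / 3600             # true anomaly from the example
    Pi = 281 + 40 / 60                        # assumed perigee longitude, Dec. 1925
    alpha0 = 15 * (16 + 43 / 60 + 44 / 3600)  # 16 hr 43 min 44 sec in degrees
    L, alpha, e = equation_of_time(W, Pi, alpha0)
    print(f"L = {L:.4f} deg, alpha = {alpha:.4f} deg, e = {e:.2f} min")
```

With these inputs the sketch lands within a few arc seconds of the tabulated L and α and within about a second of time of the tabulated equation of time.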

Star Setting


Calculate  the  time  and  azimuth  of  setting  of  a  known  star  for  a  given  place 
and  day. 


Solution.  The  method  of  calculation  can  best  be  illustrated  by  a  numerical 
example.  Thus,  let  us  consider  a  more  definite  form  of  the  problem: 

On the 31st of December, 1932, when did Saturn set in Nördlingen, Bavaria
($\varphi$ = 48° 51.1′, $\lambda$ = 10° 29.4′)? The nautical almanac gives the following data for
December 31, 1932 at midnight, mean Greenwich time: right ascension of
Saturn $\alpha$ = 20 hr 25 min 30 sec (hourly increase = 1.2 sec), declination of Saturn
$\delta$ = 19° 47.4′ S (hourly decrease = 0.06′), right ascension of the mean sun $\alpha_0$ = 18
hr 36 min 50 sec (hourly increase = 9.86 sec).

At the moment of setting the star is already in reality a certain distance h
below the horizon (Fig. 97) as a result of atmospheric refraction. The horizontal
refraction h can be set at an average of 35′, but in precise measurements special
refraction tables must be consulted.


It follows from the nautical triangle PZ* (in which PZ = b = 90° − $\varphi$
represents the complement of the latitude $\varphi$, P* = p = 90° + $\delta$ the pole distance,
Z* = z = 90° + h the zenith distance, ∠ZP* = t the hour angle, and ∠PZ* = a the
azimuth of the star), according to the cosine theorem, that


cos  z  =  cos  b  cos  p  +  sin  b  sin  p  cos  t. 


If we introduce the magnitudes h, $\varphi$, $\delta$ here instead of z, b, p, we obtain

$$\cos t = \tan\varphi\,\tan\delta - \frac{\sin h}{\cos\varphi\,\cos\delta}.$$

First we calculate the approximate time t of setting, taking for the moment of
setting $\delta$ = 19° 47.4′. We then obtain from the formula we have found (assuming
h = 35′), t = 66° 42.8′ = 4 hr 26 min 51 sec and for the time angle T of the
moment of setting

T = 16 hr 26 min 51 sec.

From this we get for the sidereal time $\vartheta$ (i.e., the time angle at the vernal
equinox) the approximate value

$\vartheta = T + \alpha$ = 36 hr 52 min 21 sec,

and thus for the mean local time of setting

M.L.T. = $\vartheta - \alpha_0$ = 18 hr 15 min 31 sec

and for the mean Greenwich time

M.G.T. = M.L.T. − ($\lambda$ = 41 min 58 sec) = 17 hr 33 min 33 sec.

At the moment of setting, then, approximately 17.55 hr have gone by since
midnight mean Greenwich time. In these 17.55 hr the three magnitudes $\alpha$, $\delta$, and
$\alpha_0$ increase by 21 sec, −1.1′, 2 min 53 sec, so that at the moment of setting they
have the values

$\alpha$ = 20 hr 25 min 51 sec, $\delta$ = 19° 46.3′,
$\alpha_0$ = 18 hr 39 min 43 sec.

The calculation must now be repeated with these exact values. This gives

T = 16 hr 26 min 57 sec
$\alpha$ = 20 hr 25 min 51 sec
$\vartheta$ = 36 hr 52 min 48 sec
$\alpha_0$ = 18 hr 39 min 43 sec
M.L.T. = 18 hr 13 min 5 sec
M.G.T. = 17 hr 31 min 7 sec.


The sought-for azimuth a is computed from the sine formula

$$\sin a : \sin t = \sin p : \sin z$$

and comes out to be

a = 120° 10′.

Result.  Saturn  set  at  18  hr  31.1  min  C.E.T.  at  an  azimuth  of  S  59°  50'  W. 
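
The first pass of this calculation can be reproduced with a small Python sketch. The function names are illustrative, the declination is entered as a positive number of degrees south (as in the example), and the obtuse branch of the sine formula is selected from the sign of cos a given by the cosine theorem, which is a choice made here rather than something spelled out in the text.

```python
import math

def setting_geometry(lat_deg, decl_south_deg, refraction_arcmin=35.0):
    """Hour angle t and azimuth a (both in degrees) at the moment of setting.

    Uses cos t = tan(phi) tan(delta) - sin(h) / (cos(phi) cos(delta)) and the
    sine formula sin a : sin t = sin p : sin z with p = 90 + delta, z = 90 + h.
    math.acos raises ValueError if the star never sets at this latitude.
    """
    phi = math.radians(lat_deg)
    delta = math.radians(decl_south_deg)
    h = math.radians(refraction_arcmin / 60.0)
    cos_t = math.tan(phi) * math.tan(delta) - math.sin(h) / (math.cos(phi) * math.cos(delta))
    t = math.acos(cos_t)
    sin_a = math.sin(t) * math.cos(delta) / math.cos(h)
    a = math.asin(min(1.0, sin_a))
    # cos a has the sign of (sin(phi) sin(h) - sin(delta)); negative means a > 90 deg.
    if math.sin(phi) * math.sin(h) - math.sin(delta) < 0.0:
        a = math.pi - a
    return math.degrees(t), math.degrees(a)

def mean_greenwich_time(t_deg, ra_star_hr, ra_mean_sun_hr, lon_east_deg):
    """M.G.T. in hours from T = 12 hr + t, sidereal time T + alpha, and the longitude."""
    T = 12.0 + t_deg / 15.0
    sidereal = T + ra_star_hr
    mean_local = (sidereal - ra_mean_sun_hr) % 24.0
    return (mean_local - lon_east_deg / 15.0) % 24.0

if __name__ == "__main__":
    t, a = setting_geometry(48 + 51.1 / 60, 19 + 47.4 / 60)
    mgt = mean_greenwich_time(t, 20 + 25 / 60 + 30 / 3600,
                              18 + 36 / 60 + 50 / 3600, 10 + 29.4 / 60)
    print(f"t = {t:.3f} deg, a = {a:.2f} deg, M.G.T. = {mgt:.4f} hr")
```

Repeating the call with the right ascensions and declination advanced to the approximate setting time corresponds to the second pass described above.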

Note.  The  method  described  is  naturally  just  as  well  suited  to  the 
determination  of  the  rising  time  or  the  time  at  which  a  star  attains  a  prescribed 
altitude.  If  it  is  specifically  desired  to  determine  the  moment  of  culmination,  the 
logarithmic  calculation  can  be  dispensed  with,  since  the  time  angle  of 
culmination,  T=  12  hr,  is  known. 

The Problem of the Sundial


To  construct  a  sundial. 

First  we  will  consider  the  two  simplest  forms  of  sundial:  the  horizontal  dial 
and  the  vertical  meridional  dial.  In  the  first  the  plane  of  the  dial  E  is  horizontal, 
in  the  second  vertical,  specifically  through  the  eastern  and  western  points  of  the 
horizon.  The  earth’s  axis  is  represented  by  a  pin,  the  gnomon  or  style  that  casts  a 
shadow  on  E.  At  noon  the  shadow  is  situated  at  its  center  position,  the  meridian 
line of the dial plane, and at t hr before or after noon forms the “shadow angle” s
or $\sigma$, respectively, with the meridian line.

The  problem  is  to  determine  the  relation  between  the  time  t  and  the  shadow 
angle. 

We  will  call  the  plane  formed  by  the  sun  and  the  earth’s  axis  (the  gnomon) 
the  shadow  plane,  since  the  shadow  must  lie  in  this  plane.  At  noon  the  shadow 
plane  at  its  central  position  passes  through  the  north  and  south  points  of  the 
horizon and at time t forms the angle t (t hr = 15t°) with its central position.


FIG.  98. 

In the figure let US, UO, and UZ be segments running from U toward the
southern point, the eastern point, and the zenith of the horizon, specifically in
such manner that SZ represents the gnomon; thus ∠USZ represents the latitude $\varphi$
of the place and SOZ the shadow plane, so that SO is the shadow; ∠USO is the
shadow angle s of the horizontal dial, ZO the shadow, ∠UZO the shadow angle $\sigma$
of the vertical meridional dial. The angle t between the shadow plane SOZ and
its meridional position SUZ is the angle UFO that is formed with UF by the
perpendicular OF dropped from O to SZ. If we select SZ as the unit length and,
for the sake of brevity, set cos $\varphi$ = o, sin $\varphi$ = i, it follows from the right triangle
SUZ that US = o, UZ = i, UF = oi, from the right triangle UOF that UO = oi tan t,
and from the right triangles USO and UZO that UO = o tan s and UO = i tan $\sigma$. If
we set the three values for UO equal to each other, we get the equations

(1) tan s = i tan t, (2) tan $\sigma$ = o tan t,

which contain the sought-for relations between the time t and the shadow angles
s and $\sigma$, respectively.

In  order  to  construct  the  dial  we  compute,  in  accordance  with  (1)  or  (2),  the 
shadow  angle  corresponding  to  different  times  t,  draw  them  in,  but  write  on  their 
free leg not s or $\sigma$, but the corresponding times t.
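
As a rough illustration of relations (1) and (2), the following Python sketch tabulates both shadow angles for whole hours. The function name and the sample latitude of 50° are arbitrary choices made here; atan2 is used instead of a plain arctangent so that hours more than six away from noon are still assigned the correct angle.

```python
import math

def shadow_angles(latitude_deg, hours_from_noon):
    """Return (s, sigma) in degrees for the horizontal and vertical meridional dial.

    Implements tan s = i tan t and tan sigma = o tan t with i = sin(phi),
    o = cos(phi), and t = 15 degrees per hour from noon.
    """
    phi = math.radians(latitude_deg)
    t = math.radians(15.0 * hours_from_noon)
    i, o = math.sin(phi), math.cos(phi)
    s = math.atan2(i * math.sin(t), math.cos(t))
    sigma = math.atan2(o * math.sin(t), math.cos(t))
    return math.degrees(s), math.degrees(sigma)

if __name__ == "__main__":
    for hour in range(0, 7):
        s, sigma = shadow_angles(50.0, hour)
        print(f"{hour} hr from noon: s = {s:6.2f} deg, sigma = {sigma:6.2f} deg")
```

The printed angles are the ones to be drawn from the meridian line and labeled with the corresponding times.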

It  is  also  possible  to  use  a  purely  graphic  method.  On  an  arbitrary  segment  AB 
we begin at B and mark off i or o times its length to C, draw the semicircle with
the  center  C  and  the  arc  center  B,  and  draw  the  tangent  through  B  which  is  at  the 
same  time  perpendicular  to  AC. 


FIG.  99. 

If we now make the arc BT equal to the time angle t (thus, for example, 45° for 3
hr), extend CT to the intersection J with the tangent, and connect J with A, then
∠BAJ = $\omega$ is the shadow angle s or $\sigma$ for time t. [From △BJA it follows that BJ =
BA tan $\omega$, from △BJC that BJ = BC tan t, so that BA tan $\omega$ = BC tan t or, since
BC is i or o times BA,

$$\tan\omega = i\tan t \quad\text{or}\quad \tan\omega = o\tan t.$$

According to (1), $\omega$ is equal to s and according to (2), $\omega = \sigma$.]

We  carry  out  the  described  construction  for  as  many  time  angles  t  as  possible 
and obtain the dial as the totality of lines AJ each of which bears written on it its
corresponding  time.  In  order  to  install  it,  we  place  the  drawing  plane 
horizontally,  so  that  BA  points  from  the  northern  point  of  the  horizon  to  the 
southern  point,  or  vertically,  so  that  BA  points  perpendicularly  upward  and  the 
tangent  runs  from  west  to  east,  and  fix  the  style  parallel  to  the  earth’s  axis  at  A. 

A  Vertical  Sundial  at  an  Arbitrary  Azimuth 

Let  us  now  consider  the  case  in  which  a  sundial  is  to  be  fastened  to  a  vertical 
house  wall  that  does  not  run  east  and  west. 

In Figure 100, let UZ be a vertical line on the wall and UH a horizontal line
on the wall, US a horizontal pointing south, ZS the gnomon, so that ∠USZ = $\varphi$
and ∠UZS = b = 90° − $\varphi$; UZS is the meridian plane and ∠SUH = a the azimuth
(calculated from the south point) of the wall; ZH is the shadow at time t, so that
ZSH is the shadow plane, and the angle that it forms with the meridian plane
ZSU is the time angle t; finally, the angle that ZH forms with ZU is the shadow
angle $\sigma$. The three-dimensional vertex Z with the edges ZU, ZH, ZS cuts out of
the sphere with the center Z a spherical triangle (shown in the figure) in which
the side $\sigma$, the angle a, the side b, and the angle t are four successive elements.

FIG. 100.

According to the cotangent theorem, therefore,

$$\cos b\,\cos a = \sin b\,\cot\sigma - \sin a\,\cot t$$

or

$$\cos\varphi\,\cot\sigma - \sin a\,\cot t = \sin\varphi\,\cos a.$$

This is the relation between the time t and the shadow angle $\sigma$. This relation
makes it possible to calculate a corresponding $\sigma$ for every t.
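
Solved for the shadow angle, the relation gives tan σ = cos φ sin t / (sin φ cos a sin t + sin a cos t). The Python sketch below evaluates this; the function name is made up here, and the sign conventions for t and a are assumed to follow the figure (for a = 90° the formula reduces to relation (2) of the meridional dial).

```python
import math

def declining_dial_angle(latitude_deg, wall_azimuth_deg, hours_from_noon):
    """Shadow angle sigma (degrees) on a vertical wall of azimuth a.

    Rearranged from cos(phi) cot(sigma) - sin(a) cot(t) = sin(phi) cos(a).
    """
    phi = math.radians(latitude_deg)
    a = math.radians(wall_azimuth_deg)
    t = math.radians(15.0 * hours_from_noon)
    numerator = math.cos(phi) * math.sin(t)
    denominator = math.sin(phi) * math.cos(a) * math.sin(t) + math.sin(a) * math.cos(t)
    return math.degrees(math.atan2(numerator, denominator))

if __name__ == "__main__":
    for hour in (1, 2, 3, 4):
        sigma = declining_dial_angle(50.0, 60.0, hour)
        print(f"{hour} hr: sigma = {sigma:.2f} deg")
```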

The  invention  of  the  sundial  is  lost  in  antiquity.  A  statement  by  Vitruvius 
(which  was  also  found  engraved  on  an  ancient  sundial  unearthed  on  the  Via 
Flaminia),  according  to  which  the  inventor  is  the  Chaldaean  Berosus,  is  not 
reliable  in  view  of  the  fact  that  sundials  were  known  in  ancient  Babylonia  many 
centuries  before  Berosus. 

The Shadow Curve


To determine the curve described by the shadow of a point of a rod in the
course of a day, when the rod is erected at a place of latitude $\varphi$ and the
declination of the sun for the day has a value of $\delta$.

Solution. We select the perpendicular from the point of the rod to the
horizon of the place as the unit length and the base point O of the perpendicular
as the origin of a right-angle coordinate system whose x-axis runs toward the
north point and whose y-axis runs toward the west point of the horizon. At the
moment in which the sun (⊙) has the azimuth S a° E and the zenith distance z,
the distance of the shadow from O is tan z, and the abscissa and ordinate,
respectively, of the shadow are

$$x = \tan z\,\cos a, \qquad y = \tan z\,\sin a.$$

In the nautical triangle PZ⊙ the latitude complement PZ = b and the pole
distance P⊙ = p = 90° − $\delta$ are constant. The zenith distance Z⊙ = z, the azimuth
supplement ∠PZ⊙ = 180° − a, and the hour angle ∠ZP⊙ = t are variable. We find
the equation of the shadow curve by expressing sin t and cos t in terms of x and y
and introducing the resulting expressions into the equation

$$\cos^2 t + \sin^2 t = 1.$$

We abbreviate sin $\varphi$, cos $\varphi$, and tan $\varphi$ as i, o, and q, respectively, and sin p,
cos p, and tan p as I, O, and Q, respectively. If we then apply to the nautical
triangle the sine theorem, cosine theorem, and cotangent theorem in that order,
we obtain the three equations

$$\sin a\,\sin z = \sin p\,\sin t,$$

$$\cos z = \cos p\,\cos b + \sin p\,\sin b\,\cos t,$$

$$-\cos b\,\cos a = \sin b\,\cot z - \sin a\,\cot t.$$

We divide the first by the second and obtain

$$\sin a\,\tan z = \frac{\tan p\,\sin t}{\sin\varphi + \cos\varphi\,\tan p\,\cos t}$$

or

$$(1)\qquad y = \frac{Q\sin t}{i + oQ\cos t}.$$

We multiply the third by tan z and obtain

$$\sin\varphi\cdot\cos a\,\tan z = \sin a\,\tan z\cdot\cot t - \cos\varphi$$

or

$$(2)\qquad ix = y\cot t - o.$$

From (1) and (2) we find

$$Q\cos t = \frac{o + ix}{i - ox}, \qquad Q\sin t = \frac{y}{i - ox},$$

and from this, in accordance with what was stated above, we obtain

$$(o + ix)^2 + y^2 = Q^2 (i - ox)^2$$

as the equation of the shadow curve. We solve for $y^2$ and obtain

$$y^2 = (Q^2 i^2 - o^2) - 2io(Q^2 + 1)x + (Q^2 o^2 - i^2)x^2$$

or, if we go on to divide by $o^2$,

$$\frac{y^2}{o^2} = (Q^2 q^2 - 1) - 2q(Q^2 + 1)x + (Q^2 - q^2)x^2.$$

To put this equation into a simpler form, we introduce a new coordinate system
X, Y whose origin U is situated at the apex of the curve, i.e., at the point where
the shadow lies at noon; the X-axis runs toward the south and the Y-axis toward
the west. When the sun is at meridian, its zenith distance is p − b, and thus

$$UO = a = \tan(p - b) = \frac{\tan p - \tan b}{1 + \tan p\,\tan b} = \frac{Qq - 1}{Q + q}.$$

We accordingly introduce

$$x = a - X, \qquad y = Y$$

into the above curve equation and obtain

$$\frac{Y^2}{o^2} = 2Q(1 + q^2)X + (Q^2 - q^2)X^2$$

or, if we write the first parenthesis as $1/o^2$ and the second as

$$\frac{1}{O^2} - \frac{1}{o^2}$$

and multiply the equation by $o^2$,

$$Y^2 = 2QX - \left(1 - \frac{o^2}{O^2}\right)X^2.$$

The vertex equation of the shadow curve thus reads

$$Y^2 = 2\tan p\cdot X - \left(1 - \frac{\cos^2\varphi}{\cos^2 p}\right)X^2.$$

The curve is consequently a conic section with the half parameter tan p and
the form number (eccentricity) cos $\varphi$ / cos p.

If the latitude is equal to the polar distance of the sun, then the shadow
describes a parabola; at higher latitudes it describes an ellipse, and at lower a
hyperbola.
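
The result can be checked numerically. The Python sketch below classifies the curve from the latitude and declination and also computes a shadow point from relations (1) and (2), so that the vertex equation can be verified for any hour angle. The function names are illustrative; a northern declination (0° < δ < 90°) and a sun above the horizon are assumed.

```python
import math

def shadow_conic(latitude_deg, declination_deg):
    """Return (half parameter, eccentricity, kind) of the shadow curve.

    Half parameter tan p and eccentricity cos(phi)/cos(p), with p = 90 - delta.
    """
    phi = math.radians(latitude_deg)
    p = math.radians(90.0 - declination_deg)
    half_parameter = math.tan(p)
    eccentricity = math.cos(phi) / math.cos(p)
    if math.isclose(eccentricity, 1.0, rel_tol=1e-9):
        kind = "parabola"
    elif eccentricity < 1.0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return half_parameter, eccentricity, kind

def shadow_point(latitude_deg, declination_deg, hour_angle_deg):
    """Shadow coordinates (X, Y) in the apex system for hour angle t."""
    phi = math.radians(latitude_deg)
    p = math.radians(90.0 - declination_deg)
    t = math.radians(hour_angle_deg)
    i, o, q, Q = math.sin(phi), math.cos(phi), math.tan(phi), math.tan(p)
    denom = i + o * Q * math.cos(t)        # positive while the sun is above the horizon
    x = (i * Q * math.cos(t) - o) / denom  # from Q cos t = (o + ix)/(i - ox)
    y = Q * math.sin(t) / denom            # relation (1)
    noon = (Q * q - 1.0) / (Q + q)         # tan(p - b), the noon shadow distance
    return noon - x, y

if __name__ == "__main__":
    semi, ecc, kind = shadow_conic(50.0, 20.0)
    X, Y = shadow_point(50.0, 20.0, 30.0)
    print(kind, round(ecc, 4), Y * Y - (2 * semi * X - (1 - ecc ** 2) * X * X))
```

The last printed number is the residual of the vertex equation and should vanish up to rounding.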


85 


Solar  and  Lunar  Eclipses 


To  determine  the  beginning  and  end  of  a  solar  eclipse,  together  with  the 
maximum  fraction  of  the  solar  disc  that  is  obscured,  if  the  right  ascensions, 
declinations,  and  radii  of  the  sun  and  moon  are  known  for  two  moments  in  time 
sufficiently  close  to  the  time  of  the  eclipse. 


Example.  At  the  famous  solar  eclipse  that  occurred  at  Athens  during  the 
Peloponnesian  War  on  August  3,  431  b.c.,  the  magnitudes  mentioned  had,  at 
4:30  p.m.  and  5:30  p.m.  mean  Athenian  time,  the  values 


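$A_0$ = 126° 51′ 52″, $\Delta_0$ = 19° 23′ 46″, $R_0$ = 15′ 52″,
$\alpha_0$ = 126° 40′ 55″, $\delta_0$ = 19° 38′ 58″, $r_0$ = 15′ 38.5″

and

$A_1$ = 126° 54′ 21″, $\Delta_1$ = 19° 23′ 11″, $R_1$ = 15′ 52″,
$\alpha_1$ = 127° 8′ 49″, $\delta_1$ = 19° 24′ 30″, $r_1$ = 15′ 36.5″

(capital letters for the sun, small letters for the moon).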


A solar eclipse can only occur at a time when the moon is sufficiently close to
the sun on the celestial sphere, i.e., at a time when the differences a = $\alpha$ − A and
d = $\delta$ − $\Delta$ between the right ascensions and declinations of the two bodies are
sufficiently small.

The  spherical  cosine  theorem  gives  for  the  spherical  distance  z  of  the 
midpoints  of  the  two  bodies  (their  central  axis)  the  formula 


$$\cos z = \sin\Delta\,\sin\delta + \cos\Delta\,\cos\delta\,\cos a.$$

We replace cos z and cos a here by

$$1 - 2\sin^2\frac{z}{2} \quad\text{and}\quad 1 - 2\sin^2\frac{a}{2}$$

and obtain

$$1 - 2\sin^2\frac{z}{2} = \cos d - 2\cos\Delta\,\cos\delta\,\sin^2\frac{a}{2}.$$

If we now write $1 - 2\sin^2\frac{d}{2}$ for cos d, we obtain

$$\sin^2\frac{z}{2} = \cos\Delta\,\cos\delta\,\sin^2\frac{a}{2} + \sin^2\frac{d}{2}.$$

If we now consider that, according to our assumption, a and d and, therefore,
also z are small angles that in no case exceed 1°, we can substitute the angles
themselves for their sines (No. 15) and write

$$z^2 = a^2\cos\Delta\,\cos\delta + d^2.$$

If in addition to this we introduce the abbreviations

$$\sqrt{\cos\Delta\,\cos\delta} = g \quad\text{and}\quad ag = x$$

and substitute y for d, we obtain the simple equation

$$z^2 = x^2 + y^2.$$

The  magnitudes  a,  x,  y,  and  z  are  most  conveniently  measured  in  angular 
seconds. 

If the right ascensions and declinations of the moon and the sun for two
moments of time sufficiently close to the time of the eclipse (the first moment
being taken as the zero point of time) are known and are, for example, $\alpha_0$, $A_0$, $\delta_0$,
and $\Delta_0$ for the first moment and $\alpha_1$, $A_1$, $\delta_1$, and $\Delta_1$ for the second, then we also
know the values a, d, and g, and therefore also x = ga and y = d for these
moments in time, and we can calculate from these the hourly increases h and k of
x and y. Since the eclipse lasts only a short time, we can assume that the
magnitudes x and y change uniformly in the period of time here under
consideration and that, consequently, at time t, i.e., at t hours after moment 0,

$$x = x_0 + ht \quad\text{and}\quad y = y_0 + kt.$$

If we introduce these values into the above equation, it assumes the form

$$z^2 = (x_0 + ht)^2 + (y_0 + kt)^2,$$

which permits us to calculate the central axis of the two bodies for any moment t.

The eclipse begins and ends at the moments when the central axis z is equal
to the sum s of the two radii R and r. In the period of time under consideration
the solar radius does not change ($R = R_0 = R_1$), while the lunar radius exhibits the
slight hourly increase $\rho$ = −2″, so that

$$r = r_0 + \rho t \quad\text{and}\quad s = R + r = R + r_0 + \rho t = s_0 + \rho t.$$

We therefore obtain for the desired moment t of the beginning (and also the
end) of the eclipse the so-called

Eclipse equation:

$$(x_0 + ht)^2 + (y_0 + kt)^2 = (s_0 + \rho t)^2.$$

This quadratic equation has two roots for the unknown t; the smaller value, t′,
indicates the beginning of the eclipse, and the larger, t″, the end.

The maximum eclipse occurs at the moment $\tau$ in which the central axis z
reaches its minimum value $\zeta$. Thus, we have

$$z^2 = z_0^2 + 2mt + n^2 t^2,$$

where

$$z_0^2 = x_0^2 + y_0^2, \qquad m = x_0 h + y_0 k, \qquad n^2 = h^2 + k^2.$$

If we write

$$z^2 = z_0^2 - \frac{m^2}{n^2} + \left[nt + \frac{m}{n}\right]^2,$$

we see that z attains its minimum value when the bracket disappears. We then
have

$$\tau = -\frac{m}{n^2}$$

and

$$\zeta^2 = z_0^2 - \frac{m^2}{n^2}.$$

At the moment of the maximum eclipse the moon has advanced over the solar
disc by $(R + r - \zeta)/2R$ of the sun’s diameter.

The fraction of the solar disc that is covered by the moon at that moment can
also be calculated easily from $\zeta$.

Carrying out the computations for the Athenian solar eclipse, we obtain:

$a_0$ = −657 (−10′ 57″), $a_1$ = +868 (+14′ 28″),
log $g_0$ = 9.97428, log $g_1$ = 9.97462,
$x_0$ = −619.2, $x_1$ = 818.7,
$y_0$ = +912 (+15′ 12″), $y_1$ = +79 (+1′ 19″),
h = $x_1 - x_0$ = 1438, k = $y_1 - y_0$ = −833,
$s_0$ = 1890.5, $s_1$ = 1888.5, $\rho$ = $s_1 - s_0$ = −2


and the eclipse equation is

$$(-619 + 1438t)^2 + (912 - 833t)^2 = (1890.5 - 2t)^2$$

or

$$2761729\,t^2 - 3292074\,t - 2359085 = 0$$

or

$$t^2 - 1.192034\,t = 0.8542059159.$$

Its roots are

$$t' = -0.50373, \qquad t'' = 1.69576.$$


Converting the decimals into minutes and seconds, we obtain −30 min 13 sec
and 1 hr 41 min 45 sec, respectively.
Consequently:

Beginning of eclipse: 3 hr 59 min 47 sec,

End of eclipse: 6 hr 11 min 45 sec.

The length of the eclipse was therefore 2 hr 12 min, the moment of maximum
eclipse 5 hr 5 min 46 sec [$2\tau = t' + t''$ gives $\tau$ = 0.596]. The central axis of the sun
and moon at this moment is obtained from

$$\zeta^2 = (619 - 1438\cdot 0.596)^2 + (912 - 833\cdot 0.596)^2;$$

it is

$$\zeta = \sqrt{238^2 + 415.5^2} = 479, \text{ i.e., } 8'.$$

The moon then covers $(R + r - \zeta)/2R$, i.e., 74% of the central solar diameter and 67% of
the solar disc.
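
The arithmetic of this example is easy to reproduce. The following Python sketch expands the eclipse equation into a quadratic in t and also locates the closest approach of the two centers; the function names are illustrative, and the inputs are the rounded values of the example (arc seconds, with t in hours after 4:30 p.m.).

```python
import math

def eclipse_contacts(x0, y0, h, k, s0, rho):
    """Roots (t', t'') of (x0 + h t)^2 + (y0 + k t)^2 = (s0 + rho t)^2, or None."""
    A = h * h + k * k - rho * rho
    B = 2.0 * (x0 * h + y0 * k - s0 * rho)
    C = x0 * x0 + y0 * y0 - s0 * s0
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return None  # the discs never touch: no eclipse
    root = math.sqrt(disc)
    return tuple(sorted(((-B - root) / (2.0 * A), (-B + root) / (2.0 * A))))

def closest_approach(x0, y0, h, k):
    """Moment tau = -m/n^2 and minimum central axis zeta."""
    m = x0 * h + y0 * k
    n2 = h * h + k * k
    tau = -m / n2
    zeta = math.hypot(x0 + h * tau, y0 + k * tau)
    return tau, zeta

if __name__ == "__main__":
    t1, t2 = eclipse_contacts(-619, 912, 1438, -833, 1890.5, -2)
    tau, zeta = closest_approach(-619, 912, 1438, -833)
    covered = (1890.5 - 2 * tau - zeta) / (2 * 952)  # (R + r - zeta) / 2R, R = 952"
    print(f"t' = {t1:.5f} hr, t'' = {t2:.5f} hr")
    print(f"tau = {tau:.4f} hr, zeta = {zeta:.1f} arcsec, diameter covered = {covered:.2f}")
```

Here τ is taken at the exact minimum of z, which differs slightly from the mean of the two roots used in the text because the right-hand side of the eclipse equation also depends on t.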

Lunar  eclipses  are  treated  in  a  similar  way.  But  here,  instead  of  being 
concerned  with  the  sun,  we  are  concerned  with  the  so-called  shadow  circle,  i.e., 
the  cross  section  of  the  conical  shadow  (the  umbra)  cast  by  the  sun-illuminated 
earth at the distance of the moon. The angle radius of this circle is equal to p − $\kappa$, where p
represents the lunar parallax* and $\kappa$ represents the half aperture angle of the
conical shadow; $\kappa$ is the excess of the angle radius R over the parallax* P of the
sun.

[In Figure 101, let S be the center of the sun, E the center of the earth, K
the apex of the conical shadow, AB the diameter of the shadow circle, se a
tangent to the periphery of the sun and the earth, EF the perpendicular to Ss from
E, and thus ∠EAe = p, ∠AEK = $\mathfrak{R}$ (the angle radius of the shadow circle), and
∠FES = ∠eKE = $\kappa$. Since p is an external angle of the triangle EKA, we have
p = $\mathfrak{R}$ + $\kappa$. It also follows from △SFE that

$$\sin\kappa = \frac{SF}{SE} = \frac{Ss}{SE} - \frac{Fs}{SE}.$$


FIG.  101. 


Since  the  minuend  of  the  right  side  is  the  sine  of  the  angle  radius  of  the  sun 
and  the  subtrahend  is  the  sine  of  the  solar  parallax,  it  follows  that 


$$\sin\kappa = \sin R - \sin P$$

or, because the angles involved are so small ($\kappa$ is smaller than 16.2′, R < 16.3′, and
P < 8.9″),

$$\kappa = R - P,$$


as  was  asserted  above.] 

The right ascension of the center of the shadow circle is the right ascension of
the sun increased or diminished by 180° and the declination is the negative of
the solar declination.

In order to take account of the atmospheric refraction, in computing a lunar
eclipse the theoretical value for the radius of the shadow circle given above,
$\mathfrak{R}$ = p + P − R, must be replaced by a value 2% greater.

Sidereal and Synodic Revolution Periods


To  determine  the  synodic  revolution  period  of  two  coplanar  rotation  rays  for 
which  the  sidereal  revolution  periods  are  known. 


A  rotation  ray  is  a  line  segment  AB  of  invariable  length  the  end  point  B  of 
which  rotates  about  the  starting  point  A  in  a  plane  E  at  a  constant  rate  of 
revolution,  while  the  starting  point  either  remains  at  rest  or  describes  a  curve  of 
plane  E.  Using  a  well-known  astronomical  expression  we  call  the  time  T  in  the 
course  of  which  the  rotation  ray  AB  describes  one  complete  revolution  of  360° 


its  sidereal  revolution  period. 

Let a second rotation ray of the plane E with the starting point a and the end
point b have the sidereal revolution period t (< T).

We  will  consider  the  angle  that  the  two  rays  form  with  each  other  at  a  given 
moment  of  time.  The  time  s  at  the  end  of  which  they  once  again  form  the  same 
angle  we  will  call  the  synodic  revolution  period  of  the  two  rays  or  the  synodic 
revolution  period  of  the  one  ray  with  respect  to  the  other. 

In order to find this we will imagine an auxiliary rotation ray a′b′ whose
starting point a′ always coincides with A and whose direction always agrees with
that of ab, and we will now consider the relative rotation of this auxiliary ray
with respect to AB. Since the rotation of a′b′ (or ab) in the unit time is equal to
360°/t and that of AB is 360°/T, the relative rotation of a′b′ with respect to AB in
each time unit is

$$(1)\qquad \delta = \left(\frac{1}{t} - \frac{1}{T}\right) 360^\circ.$$

If a′b′ resumes the same position with respect to AB at the end of s units of time,
then s$\delta$ must equal 360° or

$$(2)\qquad \delta = \frac{1}{s}\,360^\circ.$$

From (1) and (2) it follows that

$$\frac{1}{s} = \frac{1}{t} - \frac{1}{T} \quad\text{or}\quad s = \frac{Tt}{T - t},$$

and  thus  the  synodic  revolution  period  s  is  represented  as  a  function  of  the  two 
sidereal  revolution  periods  T  and  t. 

This  unpretentious  problem,  the  solution  to  which  is  also  a  model  of  brevity 
and  simplicity,  nevertheless  possesses  noteworthy  applications,  four  of  which  we 
will  discuss. 

Problem  1.  The  hands  of  a  clock  are  superimposed  one  on  the  other  at 
exactly  12:00;  when  is  the  next  time  they  are  exactly  superimposed  one  on  the 
other  ? 

Here let AB be the small hand, ab = Ab the big hand, T = 12 hr, t = 1 hr, thus
s = 12/11 hr = 1 hr 5 min 27 3/11 sec.

The event takes place at 5 min 27 3/11 sec after 1:00.

Problem 2. From the synodic revolution period (583.9 days) of Venus,
determine its sidereal revolution period.

The  sidereal  revolution  period  of  a  planet  is  understood  to  mean  the  time  in 
which  the  rotation  ray  sun-planet  makes  one  complete  revolution.  The  synodic 
revolution  period  of  the  planet  is  understood  to  mean  the  time  s  at  the  end  of 
which  the  three  celestial  bodies  sun,  earth,  planet  are  once  again  in  the  same 
position  with  respect  to  one  another. 

Here AB is the rotation ray sun-earth, ab the rotation ray sun-Venus, and T =
365¼ days. The synodic revolution period s of Venus has been determined by
observations. Its sidereal revolution period t is obtained from the relation

$$\frac{1}{t} - \frac{1}{T} = \frac{1}{s}$$

as 224.7 days.

Problem  3.  To  determine  the  relation  between  the  solar  day  and  the  sidereal 
day. 

A  solar  day  is  the  time  interval  between  two  successive  culminations  of  the 
sun,  a  sidereal  day  the  time  interval  between  two  successive  culminations  of  a 
fixed  star  or  the  time  interval  within  which  the  earth  rotates  once  about  its  own 
axis. 

Let  the  midpoint  of  the  sun  be  S,  that  of  the  earth  E,  a  marked  point  of  the 
earth’s  equator  O.  Here  AB  is  the  rotation  ray  SE,  ab  the  rotation  ray  EO,  T  is 
here  365f  days  (1  year,  the  period  of  time  in  which  AB  =  SE  completes  one  full 
revolution  of  360°),  t  the  length  of  a  sidereal  day,  and  s  the  length  of  a  solar  day 
(the  period  of  time  at  the  end  of  which  the  ray  EO  is  once  again  in  the  same 
position  relative  to  the  sun).  From 


1  1  _  1_ 
s  ~  7  T 

we  obtain 

T  T  , 

7  =  7  +  ' 

T/t  represents  the  number  of  sidereal  days,  T/s  the  number  of  solar  days,  that 
occur  in  a  year.  The  sought-for  relation  can  accordingly  be  stated  in  the 
following  form: 

A year contains one more sidereal day than the number of solar days (365¼
solar days, 366¼ sidereal days).
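
As a small illustration of Problem 3, the relation T/t = T/s + 1 fixes the length of the sidereal day once the year is given in solar days. The sketch below assumes a year of 365.25 solar days and a solar day of 86,400 seconds; the function name is hypothetical.

```python
def sidereal_day_seconds(year_in_solar_days: float = 365.25) -> float:
    """Length of the sidereal day from T/t = T/s + 1, with s = 1 solar day."""
    solar_day = 86400.0  # seconds in one solar day (assumed unit)
    sidereal_days_per_year = year_in_solar_days + 1.0  # T/t = T/s + 1
    return solar_day * year_in_solar_days / sidereal_days_per_year


seconds = sidereal_day_seconds()
hours, remainder = divmod(seconds, 3600)
minutes, secs = divmod(remainder, 60)
print(f"{int(hours)} h {int(minutes)} min {secs:.1f} s")
```

The sidereal day comes out roughly four minutes shorter than the solar day.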

Problem  4.  What  is  the  relation  between  the  sidereal  and  synodic  month? 

A sidereal month is the time it takes the rotation ray EM (earth-moon) to
complete one full revolution. A synodic month is the time interval between two
successive new moons (full moons). Here AB is the rotation ray SE, ab the
rotation ray EM, T = 365¼ days, t the length of the sidereal month, s the length of
the synodic month. The sought-for relation accordingly reads

$$\frac{1}{T} = \frac{1}{t} - \frac{1}{s}.$$

Verbally it can be stated as follows: The reciprocal of the synodic month
subtracted from the reciprocal of the sidereal month is equal to the reciprocal of
the sidereal year.

This  can  be  confirmed  for  the  numerical  values: 

t  =  27.3217  days,  s  =  29.5306  days,  T  =  365.2564  days. 
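
The confirmation can be carried out as follows, using only the three values quoted above. This is a minimal sketch: it computes the year length implied by the two month lengths and compares it with the quoted sidereal year.

```python
t, s, T = 27.3217, 29.5306, 365.2564  # days, as quoted in the text

# Year length implied by the two month lengths via 1/T = 1/t - 1/s.
implied_year = 1.0 / (1.0 / t - 1.0 / s)
print(f"implied sidereal year: {implied_year:.2f} days (quoted: {T})")
```

The two values agree to within a small fraction of a day, which is as much as the four-decimal inputs can be expected to give.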
Progressive and Retrograde Motion of the Planets


When  does  a  planet  pass  from  progressive  to  retrograde  motion  (or 
conversely,  from  retrograde  to  progressive  motion)  ? 


The  planetary  orbits,  considered  as  circles  on  the  ecliptic  plane,  their  orbital 
radii  and  revolution  periods,  as  well  as  their  positions  at  a  given  moment  of  time 
serving  as  the  starting  point  of  the  time  record  are  assumed  to  be  known. 

Solution.  The  motion  of  a  planet  is  conventionally  called  progressive  when 
it  travels  among  the  fixed  stars  of  the  celestial  sphere  like  the  sun,  i.e.,  from  west 
to  east,  and  retrograde  when  it  travels  in  the  opposite  direction,  i.e.,  from  east  to 
west.  The  transition  from  one  motion  to  the  other  occurs  when  the  planet  appears 
to  be  stationary  for  a  brief  period  among  the  fixed  stars,  in  other  words,  when  the 
sight-line  “earth-planet”  retains  the  same  direction  for  a  short  period  of  time. 

The earth and the planet have the orbital radii r and R, respectively, and the
revolution periods u and U, and the orbital radii, which are rotating about the
sun, accordingly have the rates of revolution k = 2π/u and K = 2π/U.

The solution to the problem is most conveniently obtained by the vector
method. Let O, p, P be the midpoints of the sun, the earth, and the planet, and let
$\mathbf{r} = Op$ and $\mathbf{R} = OP$ be the vectorial distances of the earth and the planet from the
sun. The vectors $\mathbf{r}$ and $\mathbf{R}$ are “rotational vectors,” i.e., vectors with the constant
lengths r and R that rotate in the ecliptic plane E with constant angular velocities k and
K, respectively, about their fixed point of origin O. For the vectors $\dot{\mathbf{r}}$ and $\dot{\mathbf{R}}$ of
the orbital velocities we again select O as the starting point. The magnitudes of
the velocities $\dot{\mathbf{r}}$ and $\dot{\mathbf{R}}$ are kr and KR, the directions always perpendicular to the
directions of $\mathbf{r}$ and $\mathbf{R}$. If we then imagine two vectors $\mathbf{r}_0$ and $\mathbf{R}_0$ situated in E,
originating at O, and possessing the magnitudes r and R, that are always 90° in
advance of the rotational vectors $\mathbf{r}$ and $\mathbf{R}$, then

$$\dot{\mathbf{r}} = k\,\mathbf{r}_0 \quad\text{and}\quad \dot{\mathbf{R}} = K\,\mathbf{R}_0.$$

The vectorial distance of the planet from the earth is
$\mathbf{s} = OP - Op = \mathbf{R} - \mathbf{r}$; the relative velocity of the planet with respect to the earth (i.e.,
the velocity of the planet for an observer on the earth, for whom the earth is at
rest) is thus

$$\mathbf{v} = \dot{\mathbf{R}} - \dot{\mathbf{r}} = K\,\mathbf{R}_0 - k\,\mathbf{r}_0.$$

Let the angle by which the vector $\mathbf{R}$ is in advance of the vector $\mathbf{r}$ at time 0 be
α and at time t let it be ζ. Then

(1) $$\zeta = \alpha + \kappa t,$$

where κ = K − k represents the angle by which the vector $\mathbf{R}$ rotates in advance of
the vector $\mathbf{r}$ in the unit time.

The motion of the planet is then progressive when the vector $\mathbf{s} = \mathbf{R} - \mathbf{r}$ rotates in a
counterclockwise direction for an observer at the North Pole and retrograde
when it rotates in a clockwise direction for this observer, i.e., in accordance with
whether the apex of the vector $\mathbf{s} \times \mathbf{v}$ (the vector product of $\mathbf{s}$ and the relative
velocity $\mathbf{v} = \dot{\mathbf{R}} - \dot{\mathbf{r}}$), which is perpendicular to E, lies above
or below the ecliptic plane. Now,

$$\mathbf{s} \times \mathbf{v} = (\mathbf{R} - \mathbf{r}) \times (\dot{\mathbf{R}} - \dot{\mathbf{r}}) = (\mathbf{R} - \mathbf{r}) \times (K\mathbf{R}_0 - k\mathbf{r}_0) = \mathbf{p} - \mathbf{q}$$

with

$$\mathbf{p} = K\,\mathbf{R} \times \mathbf{R}_0 + k\,\mathbf{r} \times \mathbf{r}_0, \qquad \mathbf{q} = K\,\mathbf{r} \times \mathbf{R}_0 + k\,\mathbf{R} \times \mathbf{r}_0,$$

it being assumed that the vectors $\mathbf{p}$ and $\mathbf{q}$ also have their starting point at O. The
vector $\mathbf{p}$ has the magnitude KR² + kr² and lies above E. The vector $\mathbf{q}$, as may be
seen from Figure 102, lies above or below E accordingly as cos ζ is positive or
negative, and has the magnitude (K + k)Rr|cos ζ|. The vector $\mathbf{s} \times \mathbf{v}$ thus lies above
or below E accordingly as KR² + kr² − (K + k)Rr cos ζ is positive or negative, i.e.,
accordingly as

$$\cos\zeta \lessgtr \frac{KR^2 + kr^2}{(K + k)Rr}.$$

FIG. 102.

Now, according to Kepler’s third law,

$$U^2 : u^2 = R^3 : r^3 \quad\text{or}\quad k^2 : K^2 = R^3 : r^3,$$

so that the ratio k : K on the right side of the obtained inequality can be replaced
by W³ : w³, where W = √R, w = √r. We thus obtain for this right side the value

$$\frac{w^3W^4 + W^3w^4}{(W^3 + w^3)\,W^2w^2} = \frac{(W + w)Ww}{W^3 + w^3} = \frac{Ww}{W^2 + w^2 - Ww} = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}},$$

and our conclusion reads:

The motion of a planet is progressive or retrograde accordingly as

$$\cos\zeta \lessgtr \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}.$$

At the moments when

(2) $$\cos\zeta = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}$$

the one type of motion changes into the other.

Example. How many days after upper conjunction does Venus become
retrograde?

Here r = 149, R = 107.5 million kilometers; k and K, in degrees per day,
are 0.9856° and 1.602°; κ thus equals 0.6164° per day, with α = 180° and
√(Rr)/(R + r − √(Rr)) = 0.974. From (1) and (2) we therefore obtain cos(0.6164°·t) =
−0.974 and from this t = 271 days.
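
The Venus example can be reproduced numerically from relations (1) and (2). The following is a sketch under the text's assumptions (circular orbits, angles in degrees, planet revolving faster than the earth); the function names are hypothetical.

```python
import math


def stationary_cosine(R: float, r: float) -> float:
    """Right side of (2): sqrt(R r) / (R + r - sqrt(R r))."""
    g = math.sqrt(R * r)
    return g / (R + r - g)


def days_until_stationary(R, r, K_deg, k_deg, alpha_deg):
    """Smallest t >= 0 with cos(alpha + kappa * t) equal to the value in (2).

    Assumes kappa = K - k > 0 (planet faster than the earth), angles in degrees.
    """
    kappa = K_deg - k_deg
    zeta0 = math.degrees(math.acos(stationary_cosine(R, r)))
    # cos(zeta) takes the required value at zeta = +zeta0 and -zeta0 (mod 360).
    gaps = [(sign * zeta0 - alpha_deg) % 360.0 for sign in (1.0, -1.0)]
    return min(gaps) / kappa


t_venus = days_until_stationary(R=107.5, r=149.0, K_deg=1.602, k_deg=0.9856,
                                alpha_deg=180.0)
print(round(t_venus))
```

Starting from upper conjunction (α = 180°), the first stationary point reached is the one shortly before lower conjunction, where the motion changes from progressive to retrograde.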
Lambert’s Comet Problem


To  express  the  time  required  for  a  comet  to  describe  an  arc  of  its  parabolic 
orbit  by  means  of  the  focal  radii  and  the  chord  connecting  the  end  points  of  the 
arc. 


Johann  Heinrich  Lambert  (1728-1777)  in  1761  published  a  paper  on  comet 
orbits  in  which  may  be  found  the  celebrated  formula  bearing  his  name;  the 
formula  represents  the  area  of  a  parabolic  focal  sector  as  a  function  of  the 
bounding  focal  radii  and  the  sector  chord. 

For  the  derivation  of  the  Lambert  formula  we  require  a  formula  of  the 
English  astronomer  Barker,  which  we  will  derive  first. 

We begin with the amplitude equation of a parabola, y² = 4kx, in which k
represents the shortest focal radius, which is commonly known to be one fourth
of the parabola parameter.

Let us consider the sector FOP, which is enclosed by the minimum focal
radius FO, the focal radius FP = r of an arbitrary point P(x|y), and the parabola
arc OP, and in which the angle OFP = W represents the so-called true anomaly
of the point P.

Barker’s  problem  is  stated  thus:  Represent  the  area  of  the  parabola  sector  as 
a  function  of  the  anomaly. 

In order to solve the problem we first express the sector area S in terms of x
and y. If we drop the perpendicular PQ from P to the axis, S is the difference
between the area of the half segment OPQ (cf. No. 56) and the area of the triangle
FPQ, so that

$$S = \tfrac{2}{3}xy - \tfrac{1}{2}(x - k)y \quad\text{or}\quad 6S = y(x + 3k).$$

We then express x and y in terms of W. According to the polar coordinate
theorem of the parabola, the focal radius (with p = 2k) is

$$r = \frac{p}{1 + \cos W} = \frac{k}{\cos^2\frac{W}{2}},$$

and consequently

$$y = r\sin W = 2r\sin\frac{W}{2}\cos\frac{W}{2} = 2k\tan\frac{W}{2}$$

and

$$x = \frac{y^2}{4k} = k\tan^2\frac{W}{2}.$$

If we introduce Barker’s auxiliary magnitude

$$T = \tan\frac{W}{2},$$

we obtain

$$x = kT^2, \qquad y = 2kT$$

(the equation of the parabola in a parametric form), and after substitution of
these values into the above area formula, we obtain

$$S = k^2\left(T + \tfrac{1}{3}T^3\right).$$

This is Barker’s formula.

FIG. 103.

W is positive or negative accordingly as P lies above or below the axis. In the
first case, T and S are positive; in the second, negative.

Now  for  the  solution  of  Lambert’s  problem! 

Let P and P' be two points of the parabola, W and W' their anomalies, T and
T' the corresponding Barker auxiliary magnitudes, S and S' the areas of the
sectors FOP and FOP', with FP = r and FP' = r' as the focal radii of the two
points, ∠PFP' = 2ζ the angle between them, PP' = s the connecting chord, and σ
the area of the sector PFP' enclosed by the two focal radii. Let r lie above the
axis and r' above or below it; in the first case, let r' < r, and thus in both cases W'
< W.

The area σ is then in both cases the difference S − S'.

Now, according to Barker,

$$3S = k^2(3T + T^3), \qquad 3S' = k^2(3T' + T'^3),$$

and consequently,

$$3\sigma = k^2(T - T')\,[3 + T^2 + T'^2 + TT'].$$

Using the abbreviations J, O, J', O' for

$$\sin\frac{W}{2}, \quad \cos\frac{W}{2}, \quad \sin\frac{W'}{2}, \quad \cos\frac{W'}{2}$$

and i, o for sin ζ, cos ζ (where ζ = (W − W')/2), we can write the factor in parentheses as

$$T - T' = \frac{J}{O} - \frac{J'}{O'} = \frac{JO' - OJ'}{OO'} = \frac{i}{OO'}$$

and the factor in square brackets as

$$[1 + T^2] + [1 + T'^2] + [1 + TT'] = \frac{O^2 + J^2}{O^2} + \frac{O'^2 + J'^2}{O'^2} + \frac{OO' + JJ'}{OO'} = \frac{1}{O^2} + \frac{1}{O'^2} + \frac{o}{OO'}.$$

If we introduce these values and, in accordance with the polar equation, express
k/O² and k/O'² as r and r', respectively, we obtain

$$3\sigma = i\,(r + r' + o\sqrt{rr'})\,\sqrt{rr'}.$$


Now,

$$i^2 = (JO' - OJ')^2 = J^2O'^2 + O^2J'^2 - 2JOJ'O'$$

$$= (1 - O^2)O'^2 + (1 - O'^2)O^2 - 2JOJ'O'$$

$$= O^2 + O'^2 - 2OO'(OO' + JJ') = O^2 + O'^2 - 2oOO',$$

and, since k = rO² = r'O'²,

$$i = \frac{\sqrt{k\,(r + r' - 2o\sqrt{rr'})}}{\sqrt{rr'}}.$$

If we introduce this value into the equation found for 3σ, we obtain

$$3\sigma = (r + r' + o\sqrt{rr'})\,\sqrt{k\,(r + r' - 2o\sqrt{rr'})}.$$

We transform this equation further by introducing the chord s. Its square,
according to the cosine theorem, is

$$s^2 = r^2 + r'^2 - 2rr'\cos 2\zeta = r^2 + r'^2 - 2rr'(2o^2 - 1),$$

i.e.,

$$s^2 = (r + r')^2 - 4rr'o^2.$$

From this we obtain

$$4rr'o^2 = (r + r' + s)(r + r' - s).$$

We abbreviate and write

$$v = \sqrt{r + r' + s}, \qquad u = \sqrt{r + r' - s},$$

obtaining

$$r + r' = \frac{v^2 + u^2}{2}, \qquad 2o\sqrt{rr'} = \pm uv,$$

where the upper sign applies when the enclosed angle 2ζ is concave and the
lower when it is convex.

If we substitute these two values into our last formula for 3σ, it finally yields

$$6\sigma = \sqrt{\frac{k}{2}}\,(v^3 \mp u^3)$$

or, in complete form,

$$6\sigma = \sqrt{\frac{k}{2}}\left[(r + r' + s)^{3/2} \mp (r + r' - s)^{3/2}\right].$$

This formula represents the parabola sector σ as a function of the two
bounding focal radii r and r' and the chord s connecting their end points.

In order to use this formula to determine the time required for a comet to
complete its orbital arc, we need only introduce the value found for σ into the
Gauss formula of the Theoria motus,

$$2\sigma = Gt\,\sqrt{p}\,\sqrt{1 + \mu}$$

(cf. No. 96).

Since here p = 2k and the comet mass μ is to be set equal to zero, we have
initially

$$Gt\sqrt{k} = \sigma\sqrt{2}$$

and, as a result of substitution,

$$6Gt = (r + r' + s)^{3/2} \mp (r + r' - s)^{3/2}.$$


This  remarkable  formula  contains  the  solution  to  the  problem  posed.  It  is 
usually  called  the  Lambert  formula,  although  it  had  already  been  formulated  by 
Euler. 

It  states  that  the  time  required  by  a  comet  to  describe  an  orbital  arc  depends 
only  on  the  arc  chord  and  the  sum  of  the  focal  radii  of  the  ends  of  the  arc. 
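
The statement can be checked numerically. The sketch below compares the sector area obtained as the difference of two Barker areas, k²(T + T³/3), with the value computed from r, r', and s alone by the sector formula 6σ = √(k/2)[(r + r' + s)^(3/2) ∓ (r + r' − s)^(3/2)], as that formula follows from the derivation. The function names and the sample values of T and T' are illustrative assumptions.

```python
import math


def barker_area(k: float, T: float) -> float:
    """Barker's formula: area of the sector FOP, S = k^2 (T + T^3 / 3)."""
    return k * k * (T + T**3 / 3.0)


def lambert_area(k: float, T: float, T2: float) -> float:
    """Sector PFP' from r, r', s only; requires T > T2 (i.e. W > W')."""
    r, r2 = k * (1 + T * T), k * (1 + T2 * T2)               # focal radii k / cos^2(W/2)
    s = math.hypot(k * (T * T - T2 * T2), 2 * k * (T - T2))  # chord PP'
    enclosed = 2 * (math.atan(T) - math.atan(T2))            # angle PFP' = W - W'
    sign = -1.0 if enclosed < math.pi else 1.0               # minus: angle below 180 degrees
    return math.sqrt(k / 2) * ((r + r2 + s) ** 1.5 + sign * (r + r2 - s) ** 1.5) / 6


for T, T2 in [(1.0, 0.0), (0.8, -0.3), (2.0, -2.0)]:
    print(barker_area(1.0, T) - barker_area(1.0, T2), lambert_area(1.0, T, T2))
```

The third pair encloses an angle greater than 180° at the focus, so it exercises the lower sign.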

According  to  Lagrange,  Lambert’s  formula  represents  the  most  beautiful  and 
significant  discovery  in  the  theory  of  comet  motion.  It  is,  in  fact,  of  fundamental 
importance  for  the  determination  of  comet  orbits. 

This  determination  is  carried  out  essentially  in  the  following  way: 

The longitude and latitude of the comet are determined for three different
moments of time, together with the corresponding longitude and distance of the
sun (from the earth). Let r and r' be the respective focal radii of the first and
third time of measurement, and s the distance between the ends of the focal radii. Then r'
and s are expressed in terms of the known magnitudes and r, and these values are
substituted into the Lambert equation, which results in an equation with only one
unknown, r. From this equation r is obtained, and then r' and s are found from
the previously mentioned expressions. This then gives us the focus and two
points of the orbit, so that it is completely determined. When the Gauss formula
is applied to one of the points, we obtain the time at which the comet passes the
perihelion. After this has been determined, the position of the comet for any
moment of time can be obtained from the Gauss formula.


*  Douwes  was  a  Dutch  admiralty  mathematician. 

* This formula is obtained as follows: Since the circle sector ONP₀ has the area J₀ = ½a²E and each
ordinate of the elliptical sector ONP is equal to b/a times the circle ordinate at that point, the area of the
sector ONP is also equal to b/a times J₀, i.e., ½abE. Consequently, the area J of the elliptical sector SNP
that is smaller than ONP by the area ½ey = ½abε sin E of the triangle OSP, is

J = ½abE − ½ab·ε·sin E.

*  The  lunar  or  solar  parallax  is  the  angle  radius  of  the  earth  on  the  moon  or  sun,  respectively. 
Steiner’s Problem Concerning the Euler Number


At what value of x, if x is a positive variable, will the expression $\sqrt[x]{x}$ be at a
maximum?

Jacob  Steiner  posed  this  problem  in  Crelle’s  Journal,  vol.  XL;  it  may  also  be 
found  in  his  Works,  vol.  2,  p.  423. 

Solution. According to the inequality of exponential functions (No. 12),

$$e^{(x - e)/e} \ge 1 + \frac{x - e}{e},$$

where the equal sign applies only when x = e. The inequality is simplified to

$$\frac{e^{x/e}}{e} \ge \frac{x}{e} \quad\text{or to}\quad e^{x/e} \ge x.$$

Here we extract the xth root and obtain

$$\sqrt[e]{e} \ge \sqrt[x]{x}.$$

Verbally expressed: The Euler number e is the number yielding the maximum
possible value for the expression $\sqrt[x]{x}$, for which x is a positive variable.
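
A quick numerical check of this result is to scan x^(1/x) over a grid of positive values. The grid limits and step below are arbitrary assumptions, and the scan only illustrates the conclusion; it is not a proof.

```python
import math


def root_value(x: float) -> float:
    """The x-th root of x, i.e. x ** (1 / x), for positive x."""
    return x ** (1.0 / x)


# Scan x = 0.001, 0.002, ..., 10.000 (grid chosen only for illustration).
grid = [i / 1000 for i in range(1, 10001)]
best_x = max(grid, key=root_value)
print(best_x, root_value(best_x), math.e ** (1.0 / math.e))
```

The grid maximum lands on the grid point nearest to e, and no grid value exceeds e^(1/e).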
Fagnano’s Altitude Base Point Problem


To  inscribe  in  a  given  acute-angled  triangle  the  triangle  of  minimum 
perimeter. 


This  celebrated  problem  stems  from  I.  F.  Fagnano,  son  of  the  Italian  count  C. 
Fagnano  (1682-1766),  who  became  famous  as  a  result  of  his  remarkable  studies 
of  lemniscate  partition. 

The  following  solution  of  the  problem  is  distinguished  by  its  extreme 
simplicity.  It  comes  from  Fr.  Gabriel-Marie,  author  of  the  excellent  book 
Exercices  de  Geometrie. 

Let the given triangle be ABC and let XYZ be a triangle inscribed in it, with X,
Y, and Z on BC, CA, and AB, respectively. We will initially consider that Z is
arbitrarily situated on AB; we draw its mirror images H and K in BC and CA,
respectively, and determine the points of intersection X and Y of the connecting
line HK with BC and CA. For a fixed point Z the triangle XYZ thus formed has the
smallest perimeter of all the inscribed triangles. In fact: let X' and Y' be two other
points on BC and CA. Since ZX' and HX' are mirror images, and also ZY' and KY', and
naturally also ZX and HX, as well as ZY and KY, the perimeters of the two
inscribed triangles to be compared can be written as

ZXYZ = HX + XY + YK = HK,

ZX'Y'Z = HX' + X'Y' + Y'K = HX'Y'K.

However,  since  the  direct  path  HK  from  H  to  K  is  shorter  than  the  roundabout 
path  HX'Y'K,  the  first  triangle  possesses  a  smaller  perimeter  than  the  second. 

It now merely remains to choose the point Z in such manner as to obtain the
smallest possible segment HK (which represents the perimeter of XYZ). Now CZ
is the mirror image of CH and also of CK; likewise, ∠ZCB = ∠HCB and
∠ZCA = ∠KCA and thus ∠HCK = 2γ, where γ = ∠ACB. Segment HK is therefore the base of an
isosceles triangle (HKC) with a constant apex angle 2γ and the variable leg s =
CZ; as such it attains a minimum when CZ is at a minimum, i.e., when CZ is
perpendicular to AB.

Since  we  could  just  as  easily  have  carried  out  the  investigation  with  X  or  Y  as 
with  Z,  AX  is  perpendicular  to  BC  and  BY  to  CA.  The  points  X,  Y,  Z  are  thus  the 
base  points  of  the  altitudes  of  the  triangle  ABC. 

Result:  Of  all  the  triangles  that  can  be  inscribed  in  a  given  acute-angled 
triangle,  the  one  with  the  smallest  perimeter  is  the  triangle  formed  by  the  base 
points  of  the  altitudes. 
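
The result can be illustrated numerically by comparing the perimeter of the altitude base point triangle with the perimeters of many randomly chosen inscribed triangles. The example triangle, the helper names, and the random sampling are assumptions made for this sketch.

```python
import math
import random


def foot(P, A, B):
    """Base point of the perpendicular dropped from P to the line AB."""
    dx, dy = B[0] - A[0], B[1] - A[1]
    t = ((P[0] - A[0]) * dx + (P[1] - A[1]) * dy) / (dx * dx + dy * dy)
    return (A[0] + t * dx, A[1] + t * dy)


def perimeter(X, Y, Z):
    return math.dist(X, Y) + math.dist(Y, Z) + math.dist(Z, X)


def along(A, B, t):
    """Point dividing the segment AB in the ratio t : (1 - t)."""
    return (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))


A, B, C = (0.0, 0.0), (4.0, 0.0), (1.5, 3.0)  # an acute-angled example triangle
orthic = perimeter(foot(A, B, C), foot(B, C, A), foot(C, A, B))

rng = random.Random(0)  # fixed seed so that the run is repeatable
best_random = min(
    perimeter(along(B, C, rng.random()), along(C, A, rng.random()),
              along(A, B, rng.random()))
    for _ in range(20000)
)
print(orthic, best_random)
```

None of the sampled inscribed triangles has a smaller perimeter than the triangle of the altitude base points.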
Fermat’s Problem for Torricelli


To  find  the  point  the  sum  of  whose  distances  from  the  vertexes  of  a  given 
triangle  is  the  smallest  possible. 

This  celebrated  problem  was  put  by  the  French  mathematician  Fermat  (1601- 
1665)  to  the  Italian  physicist  Torricelli  (1608-1647),  the  famous  student  of 
Galileo,  and  was  solved  by  the  latter  in  several  ways. 

The simplest solution is the one obtained by the use of
Viviani’s theorem: In an equilateral triangle the sum of the three distances
of a point from the sides of the triangle has a value that is independent of the
position of the point.

This  value  is  equal  to  the  altitude  of  the  triangle. 

Viviani  (1622-1703),  an  Italian  mathematician  and  physicist,  was  a  student  of 
Galileo  and  Torricelli. 

In  Viviani’s  theorem  the  distance  of  a  point  from  a  triangle  side  is  reckoned 
as  positive  when  it  is  inside  the  triangle  and  negative  when  it  is  outside. 

Proof. Let the equilateral triangle have the vertexes P, Q, and R, the side g,
the altitude h, and the area J. If x, y, z are the distances of an arbitrary point O
from the sides QR, RP, PQ, then

s = x + y + z

is the designated sum.

FIG. 105

Now, the area of the triangle PQR is composed (additively or subtractively)
of the three component triangles OQR, ORP, OPQ, so that we obtain the
equation

$$\tfrac{1}{2}gx + \tfrac{1}{2}gy + \tfrac{1}{2}gz = J$$

no matter what position the point O may have. From this we obtain directly

$$s = x + y + z = \frac{2J}{g} = h,$$

and thus the auxiliary theorem is proved. Now let ABC be the given triangle. We
choose the point O so that the three perpendiculars at A, B, C to AO, BO, CO
form an equilateral triangle PQR. Let O' be any other point. Then if O'A', O'B',
O'C' are the perpendiculars dropped from O' to QR, RP, PQ, we have

$$A'O' \le AO', \qquad B'O' \le BO', \qquad C'O' \le CO',$$

where, however, the equal sign cannot apply to all three. By addition it follows
from this that

(1) $$A'O' + B'O' + C'O' < AO' + BO' + CO'.$$

However, according to the auxiliary theorem as applied to the equilateral triangle
PQR,

(2) $$AO + BO + CO \le A'O' + B'O' + C'O',$$

where the equals sign applies when O' is inside the triangle PQR and the “smaller
than” sign when O' is outside. From (2) and (1) we get

$$AO + BO + CO < AO' + BO' + CO',$$

so that AO + BO + CO is the smallest possible sum of the distances.

Since  the  quadrilaterals  OBPC,  OCQA,  OARB  are  circle  quadrilaterals,  each 
of  the  three  angles  BOC,  COA,  and  AOB  is  equal  to  120°. 

The  point  we  are  looking  for  is  accordingly  the  common  point  of  intersection 
of  the  three  circle  arcs  with  the  chords  BC,  CA,  AB  and  the  common  peripheral 
angle  of  120°. 
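
The 120° property can be checked numerically. The sketch below does not use the geometric construction described above; it locates the minimizing point by Weiszfeld's iteration (a standard fixed-point method, substituted here as an assumption) and then measures the three angles at that point. The example triangle has all angles below 120°.

```python
import math


def fermat_point(A, B, C, iterations=1000):
    """Weiszfeld iteration for the point minimizing the sum of distances to A, B, C."""
    x = (A[0] + B[0] + C[0]) / 3.0
    y = (A[1] + B[1] + C[1]) / 3.0
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in (A, B, C):
            d = math.hypot(x - px, y - py)
            if d == 0.0:  # landed exactly on a vertex; stop there
                return (x, y)
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        x, y = num_x / denom, num_y / denom
    return (x, y)


def angle_deg(O, P, Q):
    """Angle POQ in degrees."""
    ux, uy = P[0] - O[0], P[1] - O[1]
    vx, vy = Q[0] - O[0], Q[1] - O[1]
    c = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


A, B, C = (0.0, 0.0), (4.0, 0.0), (1.5, 3.0)
O = fermat_point(A, B, C)
print(angle_deg(O, B, C), angle_deg(O, C, A), angle_deg(O, A, B))
```

For this triangle the three printed angles agree with 120° to within the accuracy of the iteration.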

The construction of this point is impossible when one triangle angle, for
example, ∠ACB = γ, reaches or exceeds 120°.

In  that  event  C  itself  is  the  point  O  that  we  are  looking  for.  Specifically,  in 
this  case, 


AC  +  BC  <  AU  +  BU  +  CU, 


no  matter  where  the  point  U  may  be. 

Proof. We introduce the angles ACU = ψ and BCU = φ. If U lies in the
space enclosed by the angle ACB = γ, the sum of ψ and φ is equal to γ; if U lies
in the space enclosed by the adjacent angle of γ, the difference between these
two angles is equal to γ; and, finally, if U lies in the space of the opposite angle
from γ, then

$$\psi + \varphi = 360^\circ - \gamma.$$

Let the base points of the perpendiculars dropped from U to AC and BC be F
and G. Their distances from C are then

$$x = CU\cos\psi \quad\text{and}\quad y = CU\cos\varphi,$$

with such a distance, e.g., x, being counted as positive when cos ψ is positive or
negative when cos ψ is negative. In each case then we have

$$AC = AF + x \quad\text{and}\quad BC = BG + y,$$

and accordingly

$$AC + BC = AF + BG + x + y.$$

Now

$$x + y = CU\cos\psi + CU\cos\varphi = CU(\cos\psi + \cos\varphi) = 2\,CU\cdot\cos\frac{\psi + \varphi}{2}\cos\frac{\psi - \varphi}{2}.$$

Since, according to the above, one of the two cosines of the right side of this
equation has the magnitude cos (γ/2), and this (because γ/2 ≥ 60°) is at most
½, the right side has a maximum magnitude of CU. This yields

$$AC + BC \le AF + BG + CU.$$

Since the legs AF and BG of the right triangles AUF and BUG are smaller than
the hypotenuses AU and BU, it is certainly true that

$$AC + BC < AU + BU + CU.$$

Q.E.D.
Tacking Under a Headwind


How  must  a  sailboat  tack  with  a  north  wind  in  order  to  get  north  as  quickly 
as  possible  ? 

Solution. Let the course of the boat be E γ° N (γ degrees north of east), and let
the sail form the acute angle α with the bearing north and the angle β with the
course bearing.

First let us solve the preliminary problem: Let the maximum speed that a
sailboat can make through the wind with the most favorable sail position be C
knots; how great a speed can it make when the angle of the sail with the bearing
of the wind is α and with the axis of the boat is β?

Let the pressure exerted upon the sail by the wind when the sail is
perpendicular to the wind be P. If the sail forms an angle α differing from 90°
with the bearing of the wind, then the wind pressure P' (which works
perpendicular to the sail) is smaller. It is reasonable to assume that the wind
pressure is now equal to only sin α times P, so that P' = P sin α. This formula,
conceived by Lossl, is, however, only approximate.


We divide P' into two components: one, p = P' sin β, in the direction of the
boat axis; the other, q = P' cos β, perpendicular to it. Of these components p is
the only relevant one for the forward motion of the boat. Thus, the pressure
exerted by the wind on the boat in the course direction has the value

$$p = P\sin\alpha\sin\beta.$$

The velocity c of the boat is proportional to this pressure:

$$c = kp = kP\sin\alpha\sin\beta,$$

where k represents the proportionality constant. For α = β = 90° this formula
becomes

$$c_{\max} = C = kP,$$

so that we can replace kP in the formula by C. The solution to our preliminary
problem thus reads

$$c = C\sin\alpha\sin\beta.$$

This formula forms the basis of the solution of the main problem. C is here the
velocity that the north wind gives to the boat when it travels due south and the
sail is perpendicular to the wind direction. If the boat is to get as far north as
possible in a given time, the northerly component c' of the boat’s velocity c must
be at a maximum. This component is, however,

$$c' = c\sin\gamma = C\sin\alpha\sin\beta\sin\gamma.$$

Consequently, what is necessary is to choose the three angles α, β, γ, the sum of
which is 90°, in such manner as to obtain the maximum product for sin α sin β
sin γ.

This  reduces  our  task  to  the  following  problem: 

When  is  the  product  of  the  sines  of  three  angles  of  a  constant  concave  sum  at 
a  maximum  ? 

The  solution  of  this  problem  is  very  similar  to  that  of  No.  10. 

It  is  based  on  the  theorem:  Of  two  angle  pairs  with  equal  concave  sums  the 
pair  possessing  the  higher  sine  product  is  the  pair  with  the  smaller  difference 
between  its  angles. 

[It follows from the formulas 2 sin X sin Y = cos (X − Y) − cos (X + Y)
and 2 sin x sin y = cos (x − y) − cos (x + y), where X, Y and x, y represent the two
pairs with the common sum

X + Y = x + y (≤ 180°).

Since the subtrahends of the right sides are equally great, the larger right side is
the one that possesses the greater minuend, i.e., in this case, the one in which the
minuend shows the smaller angle difference.]

Let the constant sum of the three variable angles α, β, γ be 3κ (≤ 180°). Now
if α, β, γ is such an angle triplet in which none of the angles chances to equal κ,
then at least one, let us say α, must necessarily be greater than κ, and another, let
us say β, must be smaller than κ. We form a new triplet α', β', γ' such that (1) α' =
κ, (2) the pairs α', β' and α, β possess equal sums, and (3) γ' = γ. According to the
above theorem, sin α' sin β' will then be > sin α sin β, and consequently, sin α'
sin β' sin γ' will also be > sin α sin β sin γ, or

(1) $$\sin\kappa\,\sin\beta'\,\sin\gamma' > \sin\alpha\,\sin\beta\,\sin\gamma.$$

Since β' + γ' = 2κ, the same theorem yields

(2) $$\sin\kappa\,\sin\kappa \ge \sin\beta'\,\sin\gamma'.$$

Combining (1) and (2), we obtain

$$\sin\kappa\,\sin\kappa\,\sin\kappa > \sin\alpha\,\sin\beta\,\sin\gamma.$$


Consequently: 

The  product  of  the  sines  of  three  angles  of  constant  concave  sum  assumes  its 
maximum  value  when  the  angles  are  equal. 

The solution to our sailboat problem thus reads α = β = γ = 30°. This means
that:

The axis of the boat must form a 60° angle with the bearing north, and the
sail must bisect the angle formed by the wind bearing and the boat’s axis.

In these optimal positions the northerly motion is equal to exactly ⅛ the
maximum southerly motion.
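
The optimum can be confirmed by brute force. The sketch below searches all whole-degree angle triplets with α + β + γ = 90° for the largest value of sin α sin β sin γ; the one-degree grid and the function name are assumptions made for illustration.

```python
import math


def northerly_fraction(alpha_deg: int, beta_deg: int) -> float:
    """c'/C = sin(alpha) sin(beta) sin(gamma) with gamma = 90 - alpha - beta."""
    gamma_deg = 90 - alpha_deg - beta_deg
    return (math.sin(math.radians(alpha_deg))
            * math.sin(math.radians(beta_deg))
            * math.sin(math.radians(gamma_deg)))


# All whole-degree pairs (alpha, beta) that leave gamma at least 1 degree.
candidates = [(a, b) for a in range(1, 89) for b in range(1, 90 - a)]
best = max(candidates, key=lambda ab: northerly_fraction(*ab))
print(best, northerly_fraction(*best))
```

The search returns α = β = 30°, hence γ = 30°, with c'/C equal to one eighth up to rounding.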
The Honeybee Cell (Problem by Reaumur)


The  cell  of  the  honeybee  (cf.  Figure  107)  has  the  form  of  a  regular  hexagonal 
prism  that  is  sealed  at  only  one  end  by  a  regular  hexagon  arbpcq,  while  at  the 
other  end  it  is  sealed  by  a  roof  consisting  of  three  congruent  rhombuses  PBSC, 
QCSA,  and  RASB  that  are  inclined  toward  each  other  and  toward  the  axis  of  the 
prism  at  equal  angles,  in  such  manner  that  the  lateral  surfaces  of  the  prism  are 
congruent  trapezoids  ( AarR ,  RrbB,  etc.).  The  longest  side  of  one  such  trapezoid 
is  somewhat  more  than  twice  as  long  as  the  diameter  of  the  inscribed  circle  of 
the  base  surface  arbpcq.  As  a  result  of  the  regular  arrangement  of  the 
rhombuses,  each  of  the  three  rhombus  diagonals  ( SP ,  SQ,  SR)  originating  at  the 
roof  apex  S  forms  the  same  angle  with  the  axis  of  the  prism  as  the  rhombus 
plane,  and  the  two  planes  ABC  and  PQR  are  perpendicular  to  the  edges  of  the 
prism. Since the obtuse-angled rhombus vertexes abut on each other at S, the
diagonals mentioned are the short rhombus diagonals.

FIG. 107.

This  singular  construction  of  the  honeybee  cell  suggested  to  naturalists  like 
Maraldi,  Reaumur,  and  others  (at  the  beginning  of  the  eighteenth  century)  that 
the  bees  had  chosen  this  design  in  order  to  save  as  much  as  possible  in  the 
building  material,  i.e.,  in  wax.  The  problem  posed  by  Reaumur  in  this 
connection  to  the  Swiss  mathematician  Koenig  can  be  stated  as: 

To  close  a  regular  hexagonal  prism  with  a  roof  consisting  of  three  congruent 
rhombuses  in  such  manner  as  to  obtain  a  solid  of  prescribed  volume  and 
minimal  surface. 

Solution. Let the regular hexagonal cross section of the prism have the side
2e, so that its shorter diagonals ab = bc = ca = 2d = 2e√3 and thus also AB =
BC = CA = 2d = 2e√3. Let the distance of the plane PQR and the apex S of the
roof from the plane ABC be x, and let the short rhombus diagonals (SP = SQ =
SR) be 2y.

Since the projection from SR = 2y on the axis of the prism is 2x, and on the
plane PQR is 2e, we obtain the equation

(1) $$y^2 = e^2 + x^2.$$

If $\mathfrak{P}$, $\mathfrak{Q}$, $\mathfrak{R}$ are the points at which the prism edges passing through P, Q, R
intersect the plane ABC, then A$\mathfrak{R}$B$\mathfrak{P}$C$\mathfrak{Q}$ is a regular hexagon with the side 2e.

First it becomes apparent that the volume of the prism undergoes no change
when the rooflike closure that has been described is chosen instead of the plane
closure A$\mathfrak{R}$B$\mathfrak{P}$C$\mathfrak{Q}$, since as much room is added on the one side of the plane
ABC (pyramid SABC) as is taken away from the other side (the three pyramids
P·BC$\mathfrak{P}$, Q·CA$\mathfrak{Q}$, R·AB$\mathfrak{R}$). Only the surface changes with the change in design;
the surface decreases by the area 6e²√3 of the hexagon A$\mathfrak{R}$B$\mathfrak{P}$C$\mathfrak{Q}$, as well as by
the area of the six right triangles P$\mathfrak{P}$B, P$\mathfrak{P}$C, Q$\mathfrak{Q}$C, Q$\mathfrak{Q}$A, R$\mathfrak{R}$A, R$\mathfrak{R}$B
(together 6ex), while it increases by the total area of the three rhombuses PBSC,
QCSA, RASB, namely 6dy = 6e√3·y. The saving in surface area thus obtained is
accordingly

$$6e^2\sqrt{3} + 6ex - 6e\sqrt{3}\,y$$

or

$$6e^2\sqrt{3} - 6e\,[\,y\sqrt{3} - x\,],$$

so that it now remains to obtain a minimum value for the expression in the
bracket

$$u = y\sqrt{3} - x$$

by an appropriate choice of x.

Now, if $v$ is understood to be the similarly constructed expression
$x\sqrt{3} - y$, then, as a result of (1),

    $u^2 - v^2 = 2(y^2 - x^2) = 2e^2$

or

    $u^2 = 2e^2 + v^2$.

From this it follows that $u$ attains a minimum when $v$ is equal to zero,
i.e., when

(2)  $y = x\sqrt{3}$.

From (1) and (2) we obtain

    $x = e\sqrt{1/2}$ and $y = e\sqrt{3/2}$.


The diagonal $SR = 2y = e\sqrt{6}$ is consequently shorter than the diagonal
$AB = 2d = 2e\sqrt{3} = e\sqrt{12}$, so that the three rhombus angles abutting on one
another at S are obtuse. If we designate the acute rhombus angle SAR as $2\varphi$,
it follows from $\tan\varphi = y/d = \tfrac{1}{2}\sqrt{2}$ and
$\tan 2\varphi = 2\tan\varphi/(1 - \tan^2\varphi)$ that $\tan 2\varphi = \sqrt{8}$,
$\cos 2\varphi = \tfrac{1}{3}$, and $2\varphi = 70^\circ\,32'$. The obtuse rhombus
angle $2\psi$ is therefore $109^\circ\,28'$.

For the angle $\mu$ of the rhombus diagonals SP, SQ, SR with respect to the axis
of the prism we obtain the relation $\tan\mu = 2e/2x = \sqrt{2}$, and thus
$\mu = 90^\circ - \varphi = 54^\circ\,44'$.

The angle $\nu$ of the rhombuses with respect to the prism cross section is,
finally, $\nu = 90^\circ - \mu = \varphi = 35^\circ\,16'$.

Since the tangent of the acute trapezoid angle ($\angle aAR$) has the value
$2e/x = \sqrt{8} = \tan 2\varphi$, the acute and obtuse angles of the trapezoid
correspond to the acute and obtuse angles, respectively, of the rhombus.
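
The minimization and the angles derived from it can be checked numerically. The following Python sketch is not part of the original solution; the function names and the choice $e = 1$ are illustrative assumptions. It evaluates the bracket $u = y\sqrt{3} - x$ along equation (1), locates its minimum by a plain grid search, and compares the result with the closed-form values $x = e\sqrt{1/2}$, $y = e\sqrt{3/2}$.

```python
import math


def bracket_u(x: float, e: float) -> float:
    """u = y*sqrt(3) - x, with y taken from equation (1): y^2 = e^2 + x^2."""
    y = math.sqrt(e * e + x * x)
    return y * math.sqrt(3) - x


def grid_minimum(e: float, steps: int = 200_000) -> float:
    """Brute-force search over 0 < x < 2e for the x that minimizes u."""
    candidates = (2 * e * i / steps for i in range(1, steps))
    return min(candidates, key=lambda x: bracket_u(x, e))


def roof_angles(e: float) -> dict:
    """Closed-form optimum and the angles derived from it (in degrees)."""
    x = e * math.sqrt(0.5)                          # distance of S from plane ABC
    y = e * math.sqrt(1.5)                          # half the short rhombus diagonal
    d = e * math.sqrt(3)                            # half the long rhombus diagonal
    acute = 2 * math.degrees(math.atan(y / d))      # rhombus angle 2*phi
    mu = math.degrees(math.atan(2 * e / (2 * x)))   # diagonal against the prism axis
    return {"x": x, "y": y, "acute": acute, "obtuse": 180 - acute,
            "mu": mu, "nu": 90 - mu}


if __name__ == "__main__":
    print(grid_minimum(1.0))   # should lie close to sqrt(1/2)
    print(roof_angles(1.0))
```

The grid search is deliberately crude; it only serves to confirm that the minimum of $u$ sits where equation (2) places it.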

Particular  interest  attaches  to  the  angles  enclosed  between  every  two 
bounding  surfaces  of  the  prism.  These  angles  are  easily  determined. 

To begin with, since the three-sided corners S, P, Q, R are congruent and
regular (each side is $2\psi$), the surface angles belonging to these corners are
all equal to each other. Since the four-sided corners A, B, C are also regular and
congruent (each side is $2\varphi$), these corners also all have the same surface
angle. Now, a surface angle of the corner P, measured at p as $\angle bpc$, equals
120°, and a surface angle of the corner A, measured at a as $\angle qar$, also
equals 120°.

Consequently,  all  the  surface  angles  of  the  prism  are  120°  (naturally,  with  the 
exception  of  the  right  angles  forming  the  base  surface). 

The  angles  we  have  just  calculated  have  in  fact  been  confirmed  by  actual 
measurement  for  the  honeybee  cell — within  the  limits  of  observational  error.  Of 
particular  interest  is  the  remarkable  fact  that  every  two  abutting  wax  surfaces 
enclose  an  angle  of  120°. 
Regiomontanus’ Maximum Problem


At what point of the earth’s surface does a perpendicularly suspended rod
appear longest? (I.e., at what point is the visual angle at a maximum?)

This problem was posed in 1471 by the mathematician Johannes Müller,
called Regiomontanus after his birthplace Königsberg in Franconia, to the Erfurt
professor Christian Roder. This problem, which in itself is not difficult,
nevertheless deserves special attention as the first extreme problem encountered
in the history of mathematics since the days of antiquity.

The author of the following simple solution is Ad. Lorsch, who published it
in vol. XXIII of the Zeitschrift für Mathematik und Physik.

Let A be the upper and B the lower end point of the rod, F the base point of
the perpendicular to the earth’s surface from A (or B), so that the segments
$FA = a$ and $FB = b$ are known. Since the rod appears to be equally long at all the
points of a circle on the earth’s surface described about F as the center, it is
sufficient to erect an arbitrary perpendicular $\mathfrak{g}$ to FA at F and to seek
on this line, which runs horizontally on the earth’s surface, the point O at which
the visual angle $\omega = \angle AOB$ is a maximum.

First Lorsch shows that the circumscribed circle $\mathfrak{K}$ of the triangle
ABO is tangent to the line $\mathfrak{g}$ at O. Indeed, if $\mathfrak{g}$ were not
tangent to $\mathfrak{K}$, then $\mathfrak{K}$ would have another point Q in common
with $\mathfrak{g}$ besides point O, and for each intermediate point Z of
$\mathfrak{g}$ between O and Q, which lies inside the circle, $\angle AZB$ would be
greater than the inscribed angle of the circle $\mathfrak{K}$ on AB, and it would
consequently be greater than $\omega$, whereas $\omega$ is supposed to be the
maximum.

Let us therefore draw the circle $\mathfrak{K}$ that passes through points A and
B and is tangent to the line $\mathfrak{g}$; the point of tangency O is the place at
which the viewing angle of the rod attains its maximum value $\omega$. Indeed, if P
is any point other than O on the line $\mathfrak{g}$, then P lies outside the
circle, so the angle APB is smaller than the inscribed angle of $\mathfrak{K}$ on
AB, and consequently smaller than $\omega$.

Lorsch also shows the most convenient and quickest method of constructing the
circle $\mathfrak{K}$, that is, its center M and radius $r$. To begin with, the
center M lies on the perpendicular bisector of AB, which runs parallel to the line
$\mathfrak{g}$ and passes through the midpoint N of AB. Now, in the rectangle MOFN
the side FN is equal to the opposite side MO, and is thus equal to $r$, so that all
that is necessary is to mark off from B (or A) the distance FN to the perpendicular
bisector in order to obtain, at the resulting point of intersection, the desired
center M.

If one wishes to determine the position of O by calculation, using its
distance $t$ from F, one need only bear in mind that, according to the tangent
theorem, $FO^2 = FA \cdot FB$. This equation immediately gives us $t = \sqrt{ab}$.
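
The result $t = \sqrt{ab}$ can be illustrated with a short Python sketch. It is not taken from Lorsch's solution; the function names and the sample heights are hypothetical. The visual angle at horizontal distance $t$ from F is written as the difference of the elevation angles of A and B, and the sketch checks that the angle at $t = \sqrt{ab}$ exceeds the angle at nearby and distant positions.

```python
import math


def visual_angle(t: float, a: float, b: float) -> float:
    """Angle AOB in radians for an observer O at distance t from F (a > b > 0)."""
    return math.atan(a / t) - math.atan(b / t)


def best_distance(a: float, b: float) -> float:
    """Distance FO given by the tangent theorem FO^2 = FA * FB."""
    return math.sqrt(a * b)


if __name__ == "__main__":
    a, b = 5.0, 2.0                      # hypothetical heights of A and B above F
    t = best_distance(a, b)
    print(t, math.degrees(visual_angle(t, a, b)))
    # every other sampled position should yield a smaller visual angle
    print(all(visual_angle(t * k, a, b) < visual_angle(t, a, b)
              for k in (0.5, 0.9, 1.1, 2.0)))
```

The radius of the tangent circle is $r = FN = (a + b)/2$, which is why the construction in the text needs nothing more than the midpoint N.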

An  interesting  variant  of  the  problem  of  Regiomontanus  is  the  Saturn 
problem,  probably  first  posed  by  Hermann  Martus,  the  author  of  the  well-known 
problem  collection: 


At  what  latitude  circle  of  Saturn  does  the  ring  appear  widest? 


Saturn  is  assumed  to  be  a  sphere  with  a  radius  of  56,900  km,  and  the  ring  is 
assumed  to  be  a  circular  ring  in  the  plane  of  Saturn’s  equator,  having  an  inner 
radius  of  88,500  km  and  an  outer  radius  of  138,800  km. 

Solution. In Figure 108, let the arc $\mathfrak{m}$ represent a meridian, M the
center of Saturn, AB the width of the ring, $MA = a$ being the outer radius and
$MB = b$ the inner radius of the ring, and let $MC = r$ be the equatorial radius of
Saturn on MA. Let O be the point situated at the latitude $\varphi = \angle CMO$ at
which the ring width appears greatest, so that $\angle AOB = \psi$ is a maximum.

We now apply Lorsch’s considerations to our figure and directly obtain the
following solution. We draw the circle $\mathfrak{K}$ that passes through the points
A and B and is tangent to the meridian $\mathfrak{m}$; the point of tangency O is the
place at which the ring width appears to be greatest.

In order to calculate the latitude $\varphi$ of O and the maximum $\psi$, we
examine the right triangles MZF and AZF, in which Z is the center of the circle
$\mathfrak{K}$ and F the center of AB. From these triangles, with the understanding
that $\rho$ is the radius of $\mathfrak{K}$, we obtain

    $\cos\varphi = \dfrac{MF}{MZ} = \dfrac{a + b}{2(r + \rho)}$  and  $\sin\psi = \dfrac{AF}{AZ} = \dfrac{a - b}{2\rho}$.

The unknown $\rho$, however, follows from the secant theorem, according to which
$MA \cdot MB = MZ^2 - \rho^2$, or $ab = (r + \rho)^2 - \rho^2 = r^2 + 2r\rho$, and
consequently $\rho = (ab - r^2)/2r$. If we introduce this into the above, we at
length obtain

    $\cos\varphi = \dfrac{(a + b)\,r}{ab + r^2}$  and  $\sin\psi = \dfrac{(a - b)\,r}{ab - r^2}$,

and from this, $\varphi = 33\tfrac{1}{2}^\circ$, $\psi = 18\tfrac{1}{2}^\circ$.
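
As a check on the arithmetic, the formulas above can be evaluated with the radii given in the problem statement. The Python sketch below is an illustration only; the function name is an assumption, and the three formulas are the ones derived in the text from the triangles MZF, AZF and the secant theorem.

```python
import math


def widest_ring_view(a: float, b: float, r: float) -> tuple[float, float, float]:
    """Return (rho, latitude phi in degrees, maximal visual angle psi in degrees)."""
    rho = (a * b - r * r) / (2 * r)           # secant theorem: ab = r^2 + 2*r*rho
    cos_phi = (a + b) / (2 * (r + rho))       # right triangle MZF
    sin_psi = (a - b) / (2 * rho)             # right triangle AZF
    return rho, math.degrees(math.acos(cos_phi)), math.degrees(math.asin(sin_psi))


if __name__ == "__main__":
    # radii in km as stated in the problem: outer ring, inner ring, Saturn
    rho, phi, psi = widest_ring_view(138_800.0, 88_500.0, 56_900.0)
    print(round(rho), round(phi, 2), round(psi, 2))
```

With these inputs the latitude comes out a little above 33½° and the angle a little below 18½°, in agreement with the rounded values quoted.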
The Maximum Brightness of Venus


In what position does the planet Venus appear to have the greatest brilliance?

Solution. Let the centers of the sun, earth, and Venus be S, E, V, the
radii of the orbits (assumed as circular) of the earth and Venus $SE = a$ and
$SV = b$, the variable distance of Venus from the earth $EV = r$, and the radius of
Venus $h$. The tangents to Venus from S and E touch Venus along circles I and II,
respectively, whose diameters in the plane SEV we will call AB and CD,
respectively. Since $AB \perp SV$ and $CD \perp EV$, the angle between the planes of
the two circles is equal to the angle $\varphi = \angle SVE$ between their normals
VS and VE. The projection of the portion of Venus that is illuminated by the sun
and visible from the earth on the plane of circle II consists of the semicircle
with the central radius VC and the area $(\pi/2)h^2$ and the projection of the
semicircle with the central radius VB, having the area $(\pi/2)h^2\cos\varphi$.
(The area of the projection of a plane surface on a plane is equal to the product
of the area of the surface and the cosine of the angle between the two planes.)
The radiation from Venus to the earth is thus exactly the same as that of a
surface at V perpendicular to the rays, with the area


    $J = \tfrac{1}{2}\pi h^2 (1 + \cos\varphi)$.

FIG. 109

If 1 cm² of this surface at distance 1 develops the illumination intensity $c$,
the entire surface generates the illumination intensity $cJ$, and at the distance
$VE = r$ the illumination intensity is

    $\dfrac{cJ}{r^2} = \dfrac{c\pi h^2}{2} \cdot \dfrac{1 + \cos\varphi}{r^2}$.

Accordingly, the illumination intensity attains a maximum when the factor

    $f = \dfrac{1 + \cos\varphi}{r^2}$

reaches its peak value.

Now, according to the cosine theorem as applied to triangle SEV,

    $\cos\varphi = \dfrac{r^2 + b^2 - a^2}{2br}$,

and consequently,

    $f = \dfrac{1}{2br} + \dfrac{1}{r^2} - \dfrac{a^2 - b^2}{2br^3}$.


This expression has the form

    $f = Ax + Bx^2 - Cx^3$,

where

    $A = \dfrac{1}{2b}$,  $B = 1$,  $C = \dfrac{a^2 - b^2}{2b}$

are constants and $x = 1/r$ is a variable. We must now make the function $f$ of $x$
as great as possible by a suitable choice of $x$. As the curve of the function
shows, $f$ initially grows as $x$ ($> 0$) increases; at a certain point $x = \alpha$
it attains its maximal value, and then declines. For every (positive)
$x \neq \alpha$, therefore,

    $Ax + Bx^2 - Cx^3 < A\alpha + B\alpha^2 - C\alpha^3$.

According as $x < \alpha$ or $x > \alpha$, we write this inequality as

    $C(\alpha^3 - x^3) < A(\alpha - x) + B(\alpha^2 - x^2)$

or

    $C(x^3 - \alpha^3) > A(x - \alpha) + B(x^2 - \alpha^2)$,

and divide both sides by $\alpha - x$ and $x - \alpha$, respectively. From this we
find that the function $C(\alpha^2 + \alpha x + x^2)$ lies below the function
$A + B(\alpha + x)$ when $x < \alpha$, and above it when $x > \alpha$. Since these
two continuous functions increase steadily, they must attain equal values at the
point $x = \alpha$, so that

    $C(\alpha^2 + \alpha^2 + \alpha^2) = A + B(\alpha + \alpha)$.

This equation yields

    $\alpha = \dfrac{B + \sqrt{B^2 + 3CA}}{3C}$.

If we introduce here the values of $A$, $B$, $C$, we find for the desired distance
$r$ ($= 1/\alpha$) the value

    $r = \sqrt{3a^2 + b^2} - 2b$.

Now all three sides of the triangle SEV for the optimal position are known
($a : b : r = 1 : 0.7233 : 0.4304$), and the sought-for angular distance
($\angle SEV$) of Venus from the sun is found to be 39° 43.5'.
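
The closing numbers can be reproduced in a few lines of Python. This is a sketch under the same assumptions as the text (circular, coplanar orbits); the function names are illustrative and do not come from the source. It evaluates $r = \sqrt{3a^2 + b^2} - 2b$, recovers the angle SEV from the cosine theorem, and confirms that the factor $f = (1 + \cos\varphi)/r^2$ is smaller a short distance to either side of the optimum.

```python
import math


def brightness_factor(r: float, a: float, b: float) -> float:
    """f = (1 + cos(phi)) / r^2, with cos(phi) from the cosine theorem in SEV."""
    cos_phi = (r * r + b * b - a * a) / (2 * b * r)
    return (1 + cos_phi) / (r * r)


def optimal_distance(a: float, b: float) -> float:
    """Earth-Venus distance at greatest brilliance."""
    return math.sqrt(3 * a * a + b * b) - 2 * b


def elongation_deg(a: float, b: float, r: float) -> float:
    """Angle SEV in degrees, from the cosine theorem applied at E."""
    cos_e = (a * a + r * r - b * b) / (2 * a * r)
    return math.degrees(math.acos(cos_e))


if __name__ == "__main__":
    a, b = 1.0, 0.7233                 # orbit radii in the ratio used in the text
    r = optimal_distance(a, b)
    print(round(r, 4), round(elongation_deg(a, b, r), 2))
```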
A Comet Inside the Earth’s Orbit


What is the maximum number of days that a comet can remain within the
earth’s orbit?


We  will  assume  that  the  earth’s  orbit  is  circular  and  the  comet’s  parabolic,  and 
that  the  orbital  planes  coincide. 

Solution. We will select the semimajor axis of the earth’s orbit as the unit
length, the mean solar day as the unit time, and we will designate the parabola
parameter as $4k$, the base line of the parabola segment lying within the earth’s
orbit as $2y$, the altitude of the segment as $x$, the sector described by the focal
radius of the comet within the earth’s orbit as $S$, and finally, the time required
to traverse the sector as $t$. Then

(1)  $y^2 = 4kx$

according to the vertex equation of the parabola,

(2)  $(x - k)^2 + y^2 = 1$

according to the circle equation, and

(3)  $3S = y(x + 3k)$

according to the formula for the area of a parabola segment [No. 56: $S$ = the
segment $-$ triangle $= \tfrac{4}{3}xy - (x - k)y$].

If $2p$ represents the orbit parameter of a celestial body of mass $\mu$ revolving
about the sun (the mass of the sun is considered as the unit mass), and if $t$ is
any time and $S$ the sector described by the body in this time, we can use the
Gauss formula*

    $\dfrac{2S}{t\sqrt{p}\,\sqrt{1 + \mu}} = G$,

where $G$ (the root of the gravitation constant) is the so-called Gauss constant,
which has the numerical value of 0.0172021 for the units assumed.

Since the mass of the comet relative to that of the sun is negligible, and since
$p = 2k$ for our parabola, the Gauss formula is transformed into

(4)  $S = Ct\sqrt{k}$, with $C = G/\sqrt{2}$

in our problem.

From (1) and (2) we find

    $x + k = 1$,  $y = 2\sqrt{k(1 - k)}$

and, making use of these values, we obtain from (3)

    $3S = 2\sqrt{k(1 - k)}\,(1 + 2k)$.

If we introduce here the value for $S$ from (4), it follows that

(5)  $t = c(1 + 2k)\sqrt{1 - k}$, with $c = \sqrt{8}/3G$.

Since $t$ is to be a maximum, the expression $(1 + 2k)\sqrt{1 - k}$ must be made as
great as possible. It therefore remains to select $k$ in such manner that the
expression, or its square, or four times its square, namely,

    $P = (1 + 2k) \cdot (1 + 2k) \cdot (4 - 4k)$,

becomes a maximum. However, since $P$ is a product of factors of constant sum,
it attains a maximum (No. 10) when the factors are equally great, thus when

    $1 + 2k = 4 - 4k$.

This gives us $k = \tfrac{1}{2}$ and, as a result of (5), $t = 4/(3G) \approx 78$.

The  sought-for  maximum  possible  length  of  stay  is  thus  78  days. 
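
Equation (5) is easy to evaluate directly. The Python sketch below is illustrative only (the function names are assumptions); it uses the value of the Gauss constant quoted in the text, scans $k$ over a grid, and evaluates the stay at $k = \tfrac{1}{2}$.

```python
import math

GAUSS_CONSTANT = 0.0172021   # value quoted in the text for the units assumed there


def days_inside(k: float) -> float:
    """Equation (5): time in days spent inside the earth's orbit, for 0 < k < 1."""
    c = math.sqrt(8) / (3 * GAUSS_CONSTANT)
    return c * (1 + 2 * k) * math.sqrt(1 - k)


def best_k(steps: int = 10_000) -> float:
    """Grid search for the k in (0, 1) that maximizes the stay."""
    return max((i / steps for i in range(1, steps)), key=days_inside)


if __name__ == "__main__":
    print(best_k(), days_inside(0.5))
```

The unrounded maximum is about 77.5 days, which the text rounds to 78.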
The Problem of the Shortest Twilight


On what day of the year is the twilight shortest at a place of given latitude?


This problem was posed, but not solved, by the Portuguese Nunes in 1542 in
his book De crepusculis. Jacob Bernoulli and d’Alembert solved the problem by
means of differential calculus, but obtained no simple results. The first
elementary solution stems from Stoll (Zeitschrift für Mathematik und Physik,
vol. XXVIII). The following very simple solution is from Brünnow’s Lehrbuch
der sphärischen Astronomie (Textbook of Spherical Astronomy).

A distinction is made between civil and astronomical twilight. Civil twilight
ends when the center of the sun stands 6½° below the horizon. Approximately
at this moment one must turn on one’s lights in order to continue working.
Astronomical twilight ends when the center of the sun stands 18° below the
horizon; it is approximately at this time that the astronomer can begin making
observations.

It  is  convenient  to  choose  as  the  beginning  of  twilight  the  moment  at  which 
the  midpoint  of  the  sun  is  intersected  by  the  horizon. 

Let the latitude of the observation point be $\varphi$, the pole distance of the
sun $p$.

The duration of the twilight is measured by the angle $d$ that is formed by the
two hour-circle arcs of the nautical triangles determined by the sun for the
beginning and end of the twilight. If we superimpose one of these triangles on
the other in such manner that the two pole distances coincide, the angle between
the two latitude complements $b$ (now having in common only the world pole P)
represents the duration $d$ of the twilight. In this position let the triangles be
PCX and PCY, with $PC = p$, $PX = PY = b = 90^\circ - \varphi$, $CX = 90^\circ$,
$CY = 90^\circ + h$ ($h$ is to be understood as representing the depth of the sun
below the horizon at the end of the twilight), and $\angle XPY = d$. Moreover, let
$XY = u$ and $\angle XCY = \psi$.
FIG. 110.

From the isosceles triangle PXY it follows, according to the cosine theorem,
that

(1)  $\cos d = \dfrac{\cos u - \sin^2\varphi}{\cos^2\varphi}$.

Consequently, $d$ becomes a minimum or $\cos d$ a maximum when $\cos u$ is at a
maximum.

From the triangle CXY it follows, however, that

    $\cos u = \cos CX \cos CY + \sin CX \sin CY \cos\psi$

or, since $\cos CX = 0$, $\sin CX = 1$, $\sin CY = \cos h$, that

    $\cos u = \cos h \cos\psi$.

Thus, $\cos u$ attains its greatest possible value when $\cos\psi$ is a maximum,
i.e., when

    $\psi = 0$.

On the day of the shortest twilight, point X accordingly falls on the side CY, and
the base $XY = u$ of the isosceles triangle PXY is $h$. At the same time we find
from (1) for the minimum duration $\mathfrak{d}$ of the twilight

    $\cos\mathfrak{d} = \dfrac{\cos h - \sin^2\varphi}{\cos^2\varphi}$

or, in accordance with the two formulas

    $\cos h = 1 - 2\sin^2\dfrac{h}{2}$,  $\cos\mathfrak{d} = 1 - 2\sin^2\dfrac{\mathfrak{d}}{2}$,

(I)  $\sin\dfrac{\mathfrak{d}}{2} = \dfrac{\sin\frac{h}{2}}{\cos\varphi}$.


To find the corresponding declination of the sun $\delta$, we express the cosine
of the angle $\omega = \angle PCX = \angle PCY$ twice in accordance with the cosine
theorem and set the resulting values equal to each other.

It follows from $\triangle PCX$ (since $\cos CX = 0$, $\sin CX = 1$) that

    $\cos\omega = \dfrac{\sin\varphi}{\sin p}$,

from $\triangle PCY$ (since $\cos CY = -\sin h$, $\sin CY = \cos h$) that

    $\cos\omega = \dfrac{\sin\varphi + \cos p \sin h}{\sin p \cos h}$.

Equalizing, we obtain

    $\sin\varphi \cos h = \sin\varphi + \cos p \sin h$

or

    $-\cos p \sin h = \sin\varphi\,(1 - \cos h)$

or

    $-\cos p \cdot 2\sin\dfrac{h}{2}\cos\dfrac{h}{2} = \sin\varphi \cdot 2\sin^2\dfrac{h}{2}$

or, finally,

    $\cos p = -\sin\varphi \tan\dfrac{h}{2}$.

Because of the minus sign, the pole distance $p$ is an obtuse angle for northern
latitudes; the sun’s declination $\delta$ is thus southerly and

(II)  $\sin\delta = \sin\varphi \tan\dfrac{h}{2}$.

The shortest twilight duration is determined by (I) and the southerly
declination of the sun for the day on which that twilight occurs is given by (II).

From  the  declination  the  sought-for  day  can  be  found  by  means  of  the 
nautical  almanac. 

This datum is also found with sufficient accuracy if the familiar formula

(2)  $\sin\delta = \sin\varepsilon \sin l$

is used; here $\delta$ represents the sun’s declination, $l$ the angular distance of
the sun from the autumnal or vernal equinox, and $\varepsilon$ the inclination of
the ecliptic (23° 27'). Since the above-mentioned angular distance changes at an
average daily rate of $m = 59.1'$, the sought-for day lies $n = l/m$ days after the
23rd of September or before the 21st of March.

For Leipzig, for example ($\varphi = 51^\circ\,20.1'$), we find for astronomical
twilight ($h = 18^\circ$), from (II), $\delta = 7^\circ\,6.2'$, then from (2),
$l = 18^\circ\,6.3'$, and then $n = 18.4$. The shortest twilight in Leipzig thus
falls on October 11 and March 3.
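
The Leipzig figures can be recomputed from formulas (I), (II), and (2). The following Python sketch is an illustration; the function name, the argument names, and the default values (18° for astronomical twilight, 23° 27' for the ecliptic, 59.1' for the daily motion) are assumptions that mirror the numbers used in the text.

```python
import math


def shortest_twilight(lat_deg: float, h_deg: float = 18.0,
                      eps_deg: float = 23.0 + 27.0 / 60.0,
                      daily_motion_arcmin: float = 59.1) -> dict:
    """Evaluate (I), (II) and (2) for a given latitude; angles in degrees."""
    phi = math.radians(lat_deg)
    h = math.radians(h_deg)
    eps = math.radians(eps_deg)
    d_min = 2 * math.asin(math.sin(h / 2) / math.cos(phi))   # (I)
    delta = math.asin(math.sin(phi) * math.tan(h / 2))       # (II), southerly
    l = math.asin(math.sin(delta) / math.sin(eps))           # (2)
    n = math.degrees(l) * 60.0 / daily_motion_arcmin         # days from the equinox
    return {"duration_deg": math.degrees(d_min),
            "declination_deg": math.degrees(delta),
            "equinox_distance_deg": math.degrees(l),
            "days_from_equinox": n}


if __name__ == "__main__":
    print(shortest_twilight(51.0 + 20.1 / 60.0))   # Leipzig
```

Formula (I) only has a solution when $\sin(h/2) \le \cos\varphi$; at higher latitudes the sketch raises a `ValueError` from `math.asin`, which corresponds to twilight that never ends on some nights.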
Steiner’s Ellipse Problem


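Of all the ellipses that can be circumscribed about (inscribed in) a given
triangle, which one has the smallest (largest) area?

“Dans le plan, la question des polygones d’aire maximum ou minimum
inscrits ou circonscrits à une ellipse ne présente aucune difficulté. Il suffit de
projeter l’ellipse de telle manière qu’elle devienne un cercle, et l’on est ramené à
une question bien connue de géométrie élémentaire” (Darboux, Principes de
Géométrie analytique, p. 287). [In the plane, the question of polygons of maximum
or minimum area inscribed in or circumscribed about an ellipse presents no
difficulty. It suffices to project the ellipse in such a way that it becomes a
circle, and one is led back to a well-known question of elementary geometry.]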

The  solution  of  the  problem  is  based  on  the  two  auxiliary  theorems: 

I.  Of  all  the  triangles  inscribed  in  a  circle  the  one  possessing  the  maximum 
area  is  the  equilateral. 

II.  Of  all  the  triangles  that  can  be  circumscribed  about  a  circle  the  one 
possessing  the  minimum  area  is  the  equilateral. 

Proof of I. We call the circle diameter $d$, the sides and angles of an inscribed
triangle $p$, $q$, $r$ and $\alpha$, $\beta$, $\gamma$, respectively, the area of the
triangle $J$. Then

    $J = \tfrac{1}{2}pq \sin\gamma$

and

    $p = d\sin\alpha$,  $q = d\sin\beta$,

and consequently,

    $J = \tfrac{1}{2}d^2 \cdot \sin\alpha \sin\beta \sin\gamma$.

According to No. 92, the product of the sines $\sin\alpha \sin\beta \sin\gamma$ of
the three angles $\alpha$, $\beta$, $\gamma$ of constant sum (180°) is at a maximum
when

    $\alpha = \beta = \gamma$ ($= 60^\circ$),

i.e., when the triangle is equilateral. The area of this maximal triangle is
$\tfrac{3}{16}\sqrt{3}\,d^2$, thus $\sqrt{27}/4\pi$ of the area of the circle.

Proof of II. If we designate the sides of an arbitrary circumscribed triangle
PQR as $p$, $q$, $r$, then the tangents to the circle from the vertexes P, Q, R are
$x = s - p$, $y = s - q$, $z = s - r$, where $s$ represents half the perimeter of
the triangle. The area $J$ of the triangle and the radius $\rho$ of the inscribed
circle are given by the well-known formulas

    $J = \rho s$ and $J = \sqrt{xyzs}$ (Hero of Alexandria).

These give us

    $s\rho^2 = xyz$.

Making use of the formula $J = \rho s$ and of $s = x + y + z$, we write this
equation in the following two ways:

(1)  $\dfrac{1}{yz} + \dfrac{1}{zx} + \dfrac{1}{xy} = \dfrac{1}{\rho^2}$,

(2)  $\dfrac{1}{yz} \cdot \dfrac{1}{zx} \cdot \dfrac{1}{xy} = \dfrac{1}{J^2\rho^2}$.

We now introduce the new unknowns

    $u = \dfrac{1}{yz}$,  $v = \dfrac{1}{zx}$,  $w = \dfrac{1}{xy}$

and obtain

    $u + v + w = \dfrac{1}{\rho^2}$,  $uvw = \dfrac{1}{J^2\rho^2}$.

Since $J$ is supposed to be a minimum and $\rho$ is constant, $uvw$ must attain a
maximum.

A product $uvw$ of numbers $u$, $v$, $w$ of constant sum ($u + v + w$ = const.)
reaches a maximum, however (No. 10), when the numbers are equal to each
other: $u = v = w$. The circumscribed triangle therefore becomes smallest when
$yz = zx = xy$, i.e., when $x = y = z$, i.e., when $p = q = r$, which proves II.

We find that the area of the smallest circumscribed triangle is four times that
of the maximum inscribed triangle, i.e., $\sqrt{27}\,\rho^2$, and for the ratio of
this area to the area of the circle we obtain the improper fraction
$\sqrt{27}/\pi$.
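
Both auxiliary theorems can be spot-checked numerically. The Python sketch below is not a proof and is not taken from the source; the function names are assumptions. It parametrizes triangles by two angles on a one-degree grid, uses $J = \tfrac{1}{2}d^2 \sin\alpha \sin\beta \sin\gamma$ for inscribed triangles, and for circumscribed triangles uses $J = \rho s$ with the tangent lengths $x = \rho\cot(\alpha/2)$, $y = \rho\cot(\beta/2)$, $z = \rho\cot(\gamma/2)$, a standard relation that the text does not state explicitly.

```python
import math


def inscribed_area(alpha: float, beta: float, d: float = 1.0) -> float:
    """Area of a triangle inscribed in a circle of diameter d (angles in radians)."""
    gamma = math.pi - alpha - beta
    return 0.5 * d * d * math.sin(alpha) * math.sin(beta) * math.sin(gamma)


def circumscribed_area(alpha: float, beta: float, rho: float = 1.0) -> float:
    """Area J = rho * s of a triangle circumscribed about a circle of radius rho."""
    gamma = math.pi - alpha - beta
    tangents = [rho / math.tan(t / 2) for t in (alpha, beta, gamma)]  # x, y, z
    return rho * sum(tangents)                                        # s = x + y + z


def angle_grid(n: int = 180) -> list:
    """All angle pairs (alpha, beta) on a grid of pi/n with alpha + beta < pi."""
    step = math.pi / n
    return [(i * step, j * step) for i in range(1, n) for j in range(1, n - i)]


if __name__ == "__main__":
    pts = angle_grid()
    best_in = max(pts, key=lambda p: inscribed_area(*p))
    best_out = min(pts, key=lambda p: circumscribed_area(*p))
    print([round(math.degrees(t)) for t in best_in + best_out])
```

With $d = 1$ and $\rho = \tfrac{1}{2}$ (the same circle), the two extreme areas stand in the ratio 4 : 1 stated in the text.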

Now for the solution of the ellipse problem! Let $\mathfrak{e}$ be any ellipse
circumscribed about (inscribed in) the given triangle abc, $f$ its surface area,
$\delta$ the area of the triangle abc. We consider $\mathfrak{e}$ as the normal
projection of a circle $\mathfrak{K}$, whose surface area we will call $F$. In the
projection the inscribed (circumscribed) triangle ABC of the circle, possessing an
area we will call $\Delta$, corresponds to the inscribed (circumscribed) triangle
abc of the ellipse. If $\mu$ represents the cosine of the angle between the plane of
the circle and the plane of the ellipse, then the normal projection of every
surface lying in the plane of the circle is the $\mu$-multiple of the surface. This
gives us the formulas

    $f = \mu F$,  $\delta = \mu\Delta$.

Since $\delta$ is constant, $f$ attains a minimum (maximum) when the quotient
$f/\delta$ or the equal quotient $F/\Delta$ reaches a minimum (maximum). The latter
quotient, however, according to auxiliary theorem I (II), reaches its minimal
(maximal) value $4\pi/\sqrt{27}$ ($\pi/\sqrt{27}$) when the triangle ABC is
equilateral.

To  establish  more  exactly  the  ellipse  determined  by  this  condition,  we  make 
use  of  the  properties  of  a  normal  projection:  1.  Parallelism  is  not  annulled  by 
projection.  2.  The  ratio  between  parallel  segments  is  maintained  in  projection:  in 
particular,  the  ratio  of  two  segments  of  the  same  line  is  not  altered. 

Now, the center M of the circle is the point of intersection of the medians of
the equilateral triangle ABC and the diameter through C bisects the chords of the
circle parallel to AB. Consequently, the point of intersection of the medians of
the triangle abc is the center point m of the sought-for ellipse, and the ellipse
diameter through c bisects the ellipse chords parallel to the side ab, so that ab
and mc are conjugate directions of the ellipse. Now, since the circle radius MK
parallel to the circle chord (tangent) AB is equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of
AB, the ellipse half diameter mk parallel to the ellipse chord (tangent) ab is also
equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of ab.

Result. Of all the ellipses that can be circumscribed about (inscribed in) a
given triangle abc, the one with the smallest (greatest) area is the ellipse whose
center m is the point of intersection of the medians of the triangle abc and
for which the ellipse half diameter to c (to the center of ab) and the ellipse half
diameter parallel to ab, $mk = ab/\sqrt{3}$ ($ab/2\sqrt{3}$), are conjugate half
diameters. The area of the ellipse thus characterized, the so-called Steiner
ellipse, is

    $\dfrac{4\pi}{\sqrt{27}}$ $\left(\dfrac{\pi}{\sqrt{27}}\right)$ of the area of the triangle.

This  ellipse  can  be  constructed  easily  in  accordance  with  No.  42. 
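
The Result can be verified for a concrete triangle. The Python sketch below is an illustration with hypothetical function names; it relies on the standard fact, not stated in the text, that an ellipse with conjugate half diameters $\vec{u}$, $\vec{v}$ has area $\pi\,|\vec{u} \times \vec{v}|$. The half diameters are built exactly as the Result describes them, and the area is compared with $4\pi/\sqrt{27}$ ($\pi/\sqrt{27}$) times the triangle area.

```python
import math

Point = tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of triangle abc from the cross product of two edge vectors."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / 2


def steiner_ellipse_area(a: Point, b: Point, c: Point, inscribed: bool = False) -> float:
    """Area pi * |u x v| from the conjugate half diameters named in the Result."""
    m = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)   # median point
    # half diameter parallel to ab: ab/sqrt(3), or ab/(2*sqrt(3)) when inscribed
    scale = 1 / (2 * math.sqrt(3)) if inscribed else 1 / math.sqrt(3)
    v = ((b[0] - a[0]) * scale, (b[1] - a[1]) * scale)
    # other half diameter: from m to c, or from m to the center of ab when inscribed
    end = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2) if inscribed else c
    u = (end[0] - m[0], end[1] - m[1])
    return math.pi * abs(u[0] * v[1] - u[1] * v[0])


if __name__ == "__main__":
    tri = ((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))   # arbitrary sample triangle
    area = triangle_area(*tri)
    print(steiner_ellipse_area(*tri) / area, 4 * math.pi / math.sqrt(27))
    print(steiner_ellipse_area(*tri, inscribed=True) / area, math.pi / math.sqrt(27))
```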
Steiner’s Circle Problem


Of  all  isoperimetric  plane  surfaces  (i.e.,  those  having  equal  perimeters)  the 
circle  has  the  greatest  area. 

And  conversely: 

Of  all  plane  surfaces  with  equal  area  the  circle  has  the  smallest  perimeter. 

This fundamental double theorem was first proved by J. Steiner (Crelles
Journal, vol. XVIII; also in Steiner’s Gesammelte Werke, vol. II). Steiner even
provided several proofs. Here we will consider only the one that is based upon
the Steiner symmetrization principle.

First  we  will  prove  the  second  half  of  the  theorem. 

It  is  obviously  sufficient  to  limit  our  considerations  to  convex  surfaces,  i.e., 
those  surfaces  in  which  the  line  segment  connecting  two  arbitrary  points  of  the 
surface  belongs  completely  to  the  surface. 

We  will  first  prove  the  auxiliary  theorem: 

Of  all  trapezoids  with  common  base  lines  and  altitudes  the  isosceles 
trapezoid  is  the  one  the  sum  of  whose  legs  is  smallest. 

Let ABCD be an arbitrary trapezoid with the base lines BC and AD, the legs
AB and CD. Let the mirror image of B on the perpendicular bisector of AD be B',
let the center of CB' be $C_0$. On the extension of CB we set $BB_0 = CC_0$ and
obtain the isosceles trapezoid $AB_0C_0D$, which has base lines and altitude in
common with the given trapezoid, and consequently also the same area.


FIG. 111

If we extend $DC_0$ by its own length to H, we obtain the parallelogram DCHB'
in which the diagonal DH is shorter than the sum of the sides DC and CH:

    $DH < DC + CH$.

However, since $DH = 2 \cdot DC_0 = DC_0 + AB_0$ and $CH = DB' = AB$, we obtain

    $AB_0 + DC_0 < AB + DC$.

Thus, the isosceles trapezoid has the smallest leg sum.
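
The auxiliary theorem can be illustrated with coordinates. In the Python sketch below, which is an illustration and not part of Steiner's argument, the base AD of length `p` lies on the x-axis centered at the origin, the base BC of length `q` lies at height `h`, and `s` is the horizontal offset of the center of BC; all of these names are assumptions. The isosceles trapezoid corresponds to `s = 0`.

```python
import math


def leg_sum(p: float, q: float, h: float, s: float) -> float:
    """Sum AB + CD of the legs for bases p, q, altitude h and horizontal offset s."""
    w = (p - q) / 2                      # overhang of AD on each side when s = 0
    ab = math.hypot(s + w, h)            # from A = (-p/2, 0) to B = (s - q/2, h)
    cd = math.hypot(s - w, h)            # from D = (p/2, 0) to C = (s + q/2, h)
    return ab + cd


if __name__ == "__main__":
    p, q, h = 5.0, 3.0, 2.0              # hypothetical base lines and altitude
    for s in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(s, round(leg_sum(p, q, h, s), 4))
```

Every nonzero offset yields a larger leg sum than the isosceles case, in line with the inequality $AB_0 + DC_0 < AB + DC$.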

Now let $\mathfrak{F}$ be the surface having the smallest perimeter for the given
area $J$; let the perimeter be $u$.

We draw an arbitrary line $\mathfrak{g}$ and divide $\mathfrak{F}$ by
perpendiculars to $\mathfrak{g}$ into trapezoids ABCD that we select so narrow that
the arc-shaped legs AB and CD can be considered as rectilinear. From the points of
intersection of the dividing lines ..., AD, BC, ... with $\mathfrak{g}$ we mark off
on the dividing lines on both sides of $\mathfrak{g}$ the half chords ...,
$\tfrac{1}{2}AD$, $\tfrac{1}{2}BC$, ..., as a result of which we obtain the points
..., A', D', B', C', ... and the trapezoids ..., A'B'C'D', .... The new trapezoid
A'B'C'D' is isosceles and possesses equal base lines and altitude with ABCD, so
that the area is also the same. This gives us

(1)  $A'B' + C'D' \le AB + CD$,

in which the equals sign applies only when ABCD is also isosceles.

Our method enables us to obtain from $\mathfrak{F}$ a new surface $\mathfrak{F}'$
with the symmetry axis $\mathfrak{g}$, having the same area as $\mathfrak{F}$ and a
perimeter, therefore, that cannot be smaller than $u$. Thus, the equals sign in (1)
must always apply. All trapezoids ABCD are therefore isosceles, and the common
perpendicular bisector of the chords ..., AD, BC, ... is an axis of symmetry of
$\mathfrak{F}$.

The surface $\mathfrak{F}$ of minimal perimeter therefore possesses an axis of
symmetry in every direction.


But  such  a  surface  must  be  a  circle! 

Proof. Let I and II be two mutually perpendicular symmetry axes of
$\mathfrak{F}$, M their point of intersection. Let the mirror image of an arbitrary
point P of $\mathfrak{F}$ on I be $P_1$, and let the mirror image of $P_1$ on II be
$P' = P_{12}$. Then PMP' is a straight line and

    $MP' = MP$,

i.e., the point M is a midpoint of the surface.

Now $\mathfrak{F}$ can only have one midpoint. Indeed, if N were a second
midpoint, then extending PM by its own length, we would first arrive at P'; next,
extending P'N by its own length, we would arrive at a new point P'' of
$\mathfrak{F}$; then extending P''M by its own length, we would arrive at a point
P''' of $\mathfrak{F}$; extending P'''N by its own length, we would then come to
still another point of $\mathfrak{F}$, etc. If these operations are represented
graphically it will be observed that in this manner we would end up at some
arbitrary distance beyond the drawing paper (on which $\mathfrak{F}$ lies), which is
naturally absurd. Thus, $\mathfrak{F}$ has only the one midpoint M.

It follows from this, further, that this M must belong to each axis of
symmetry of $\mathfrak{F}$.

Indeed, if M does not lie on the axis of symmetry $a$ of $\mathfrak{F}$, then we
can draw the mirror images m and p of M and of an arbitrary surface point P on $a$,
extend pM by its own length to the surface point p', and draw the mirror image p''
of p' on $a$. Now, since p'' is a point of $\mathfrak{F}$, Pmp'' is a straight line,
and mp'' = mP, this would mean that $\mathfrak{F}$ had a second midpoint, m, and
this is impossible.

Thus,  all  the  axes  of  symmetry  intersect  at  M. 

Now let F be a fixed boundary point of $\mathfrak{F}$ and P an arbitrary boundary
point of $\mathfrak{F}$. Since the perpendicular bisector of FP is an axis of
symmetry of $\mathfrak{F}$, it passes through M. Therefore,

    $MP = MF$;

i.e., all the boundary points of $\mathfrak{F}$ are equidistant from M, and the
surface $\mathfrak{F}$ is a circle.

Consequently, of all surfaces of equal area the circle has the smallest
perimeter. We now state conversely:

Of all isoperimetric surfaces the circle has the greatest area.

Proof. Let the perimeter f of an arbitrary surface 3, which is not a circle, be
equal to the perimeter k of the circular surface 𝔎. Let the area of 3 be F and that
of 𝔎 be K.

Now, if F ≥ K, we will consider the circular surface 𝔎', concentric to 𝔎, of
area K' = F, and we will let its perimeter be k'. Since 𝔎' covers 𝔎,

(2) k' ≥ k.

However, since the surfaces 𝔎' and 3 have the same area, according to the
theorem proved above, k' < f, or

(3) k' < k.

The inequalities (2) and (3) contradict each other, however, and thus the
assumption that F ≥ K must be false. Consequently, F < K. Q.E.D.
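As a numerical illustration (not a proof), the theorem can be checked on regular polygons of a fixed perimeter. The sketch below uses the standard area formula for a regular n-gon; the choice of polygons and of the unit perimeter is an assumption made only for the example.

```python
import math

def regular_polygon_area(n, perimeter):
    """Area of the regular n-gon with the given perimeter."""
    side = perimeter / n
    return n * side ** 2 / (4 * math.tan(math.pi / n))

def circle_area(perimeter):
    """Area of the circle with the given perimeter."""
    return perimeter ** 2 / (4 * math.pi)

perimeter = 1.0
sides = (3, 4, 6, 12, 100)
areas = [regular_polygon_area(n, perimeter) for n in sides]

for n, a in zip(sides, areas):
    print(n, round(a, 6))
print("circle", round(circle_area(perimeter), 6))
```

The areas grow with the number of sides and stay below the area of the circle with the same perimeter.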

The  foregoing  Steiner  proof  of  the  major  isoperimetric  theorem  for  the  circle 
has  certain  weaknesses.  The  same  is  true  of  the  proof  of  the  major  isoperimetric 
theorem  for  the  sphere,  presented  in  the  following  section. 

The  reader  may  learn  how  these  weaknesses  can  be  eliminated  and  the 
Steiner  proof  formulated  in  a  completely  rigorous  fashion  by  consulting  the 
excellent  book  Kreis  und  Kugel  (Circle  and  Sphere)  by  W.  Blaschke. 
Unfortunately,  we  cannot  go  into  these  interesting  investigations  because  of  lack 
of  space. 
Steiner’s Sphere Problem


Of  all  solids  of  equal  surface  the  sphere  possesses  the  maximum  volume. 

Of  all  solids  of  equal  volume  the  sphere  possesses  the  smallest  surface. 
(Steiner, Crelle's Journal, vol. XVIII; Steiner, Gesammelte Werke, vol. II.)

As  in  No.  99,  we  will  prove  the  second  part  of  the  theorem  first. 

Naturally,  we  will  consider  only  convex  solids,  i.e.,  those  solids  in  which  the 
line  segment  connecting  two  arbitrary  points  on  the  solid  belongs  completely  to 
the  solid. 

Steiner’s  proof  is  based  on  the  principle  of  symmetrization  and  the  theorem: 

Of all triangular prisms whose parallel edges AA', BB', CC' have the
prescribed lengths h, k, l and lie on three given lines, the prism with the plane of
symmetry normal to the edges possesses the smallest base surface sum
ABC + A'B'C'.

Proof.  We  will  designate  the  distances  of  the  edges  from  one  another  as  a,  b, 
c,  so  that 


𝔄 = ½a(k + l), 𝔅 = ½b(l + h), ℭ = ½c(h + k)


are the areas of the three trapezoidal prism faces. These areas are given
magnitudes. We extend CB and C'B' to the point of intersection P, and CA and
C'A' to the point of intersection Q, and obtain the tetrahedron CC'PQ, in which for
brevity we will call the surfaces CC'P and CC'Q “lateral surfaces” and the
surfaces CPQ and C'PQ “top surfaces.”

We determine the relations between the areas J, J', 𝔓, 𝔔 of the tetrahedron
bounding surfaces CPQ, C'PQ, CC'P, CC'Q, on the one hand, and the areas Δ, Δ',
𝔄, 𝔅, ℭ of the prism bounding surfaces ABC, A'B'C', BB'C'C, CC'A'A, AA'B'B,
on the other.

From the ray theorem it follows that

(1) CP/CB = C'P/C'B' = l/λ, CQ/CA = C'Q/C'A' = l/μ,

where λ is the difference between l and k, and μ is the difference between l and
h. Now, since the areas of similar triangles are in the same proportion to each
other as the squares of homologous sides, we obtain the relations

𝔓/(𝔓 − 𝔄) = l²/k² and 𝔔/(𝔔 − 𝔅) = l²/h².


From these we obtain

(2) 𝔓 = α𝔄, 𝔔 = β𝔅,

with

α = l²/(l² − k²) and β = l²/(l² − h²).


Moreover, since the areas of two triangles with a common angle are to each
other as the products of the adjacent sides of this angle, we obtain

J/Δ = (CP·CQ)/(CA·CB) and J'/Δ' = (C'P·C'Q)/(C'A'·C'B'),

and consequently as a result of (1),

(3) J = κΔ and J' = κΔ',

where κ is the constant l²/(λμ).

From (2) it follows that the areas 𝔓 and 𝔔 of the lateral surfaces of the
tetrahedron are constant no matter where the prism edges AA', BB', CC' happen
to lie, and from (3), that the sum S of the areas J and J' of the top surfaces of the
tetrahedron is κ times the sum Σ of the areas Δ and Δ' of the base surfaces of the
prism:

(4) S = κΣ.
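Relations (1) to (3) can be spot-checked on a concrete prism. The sketch below is only an illustration under assumed coordinates: the edges are taken parallel to the z-axis, the lengths are h = 1, k = 2, l = 3, and names such as `face_A`, `face_P`, `C1` (for C') are hypothetical labels for the areas 𝔄, 𝔓 and the primed points.

```python
import math

def along(origin, target, t):
    """Point origin + t * (target - origin)."""
    return tuple(o + t * (g - o) for o, g in zip(origin, target))

def triangle_area(a, b, c):
    """Area of the triangle abc in space, via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(t * t for t in cross))

h, k, l = 1.0, 2.0, 3.0                       # lengths of AA', BB', CC'
C, C1 = (0.0, 0.0, 0.0), (0.0, 0.0, l)        # C1 stands for C'
B, B1 = (4.0, 0.0, 0.7), (4.0, 0.0, 0.7 + k)  # B1 stands for B'
A, A1 = (1.0, 3.0, -0.4), (1.0, 3.0, -0.4 + h)

lam, mu = l - k, l - h
P = along(C, B, l / lam)    # relation (1): CP/CB = l/lambda
Q = along(C, A, l / mu)     # relation (1): CQ/CA = l/mu

face_A = triangle_area(B, B1, C1) + triangle_area(B, C1, C)  # trapezoid BB'C'C
face_P = triangle_area(C, C1, P)                             # lateral surface CC'P
alpha = l ** 2 / (l ** 2 - k ** 2)
kappa = l ** 2 / (lam * mu)

checks = {
    "P lies on C'B'": all(math.isclose(s, t, abs_tol=1e-9)
                          for s, t in zip(P, along(C1, B1, l / lam))),
    "(2)": math.isclose(face_P, alpha * face_A),
    "(3) for J": math.isclose(triangle_area(C, P, Q),
                              kappa * triangle_area(A, B, C)),
    "(3) for J'": math.isclose(triangle_area(C1, P, Q),
                               kappa * triangle_area(A1, B1, C1)),
}
print(checks)
```

Changing the heights 0.7 and -0.4 moves the prism edges along their lines; the lateral area `face_P` stays the same, in line with the remark following (3).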

We will now prove the auxiliary theorem: Of all tetrahedrons with two fixed
corners C, C' and two movable corners P and Q that lie on the fixed lines I and
II parallel to CC', the tetrahedron in which P and Q lie on the perpendicular
bisector plane of CC' is the one possessing the smallest area sum S of its top
surfaces CPQ and C'PQ.

To begin with, it is clear that the tetrahedrons concerned all have the same
volume V. (The base surface CC'P has the constant area 𝔓, and the
corresponding apex Q lies on a fixed parallel to the plane CC'P.)

We draw through the center M of CC' the plane E normal to CC' and
designate its points of intersection with the lines I and II as p and q. Let P and Q
be two (other) points anywhere on I and II.

We  now  express  the  tetrahedron  volume  V,  first  using  the  tetrahedron  CC'pq 
and  then  the  tetrahedron  CC'PQ. 

For  this  purpose  we  construct  at  C  and  C'  on  the  top  surfaces  Cpq  and  C'pq 
perpendiculars  running  toward  the  inside*  of  these  surfaces  and  designate  their 
point  of  intersection  on  E  as  O. 

We will select the common length of the two perpendiculars as our unit
length. The perpendiculars from O to the top surfaces CPQ and C'PQ and to the
planes I·CC' and II·CC' we will designate as x, x', m, n, the common area of
the lateral surfaces CC'p and CC'P as 𝔓, that of the lateral surfaces CC'q and
CC'Q as 𝔔, and, finally, the areas of the top surfaces Cpq, C'pq, CPQ, C'PQ as
i, i', J, J'. We then obtain for the volume V of the tetrahedrons CC'pq and
CC'PQ the formulas

3V = i + i' + m𝔓 + n𝔔 and 3V = xJ + x'J' + m𝔓 + n𝔔,

respectively [where x, x', m, and n, respectively, are positive or negative
accordingly as O lies on the inside or outside of the bounding surfaces CPQ,
C'PQ, I·CC', and II·CC', respectively]. It follows from this that

xJ + x'J' = i + i'.

If we consider that the perpendicular x (x') from O to the plane CPQ (C'PQ) is
shorter than the oblique line OC (OC'), we see that x and x' are proper fractions.
The left side of the last equation is therefore smaller than J + J' and consequently
also

i + i' < J + J',

which  proves  the  auxiliary  theorem. 
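The auxiliary theorem can also be observed numerically. The following is a minimal sketch under assumed coordinates: CC' lies on the z-axis with E as the plane z = 0, and the lines I and II are the verticals through (3, 0) and (1, 2); these positions are chosen only for the illustration.

```python
import math

def triangle_area(a, b, c):
    """Area of the triangle abc in space, via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(t * t for t in cross))

half = 1.5                                      # half the length of CC'
C, C1 = (0.0, 0.0, -half), (0.0, 0.0, half)     # C1 stands for C'

def top_sum(s, t):
    """S = CPQ + C'PQ with P at height s on line I and Q at height t on line II."""
    P = (3.0, 0.0, s)
    Q = (1.0, 2.0, t)
    return triangle_area(C, P, Q) + triangle_area(C1, P, Q)

minimum = top_sum(0.0, 0.0)                     # P = p and Q = q lie on E
grid = [i / 4 for i in range(-12, 13)]
others = [top_sum(s, t) for s in grid for t in grid if (s, t) != (0.0, 0.0)]

print(minimum, min(others))
```

Every sampled position of P and Q off the plane E gives a larger sum than the position on E.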

We now go back to (4). Since, according to the auxiliary theorem, S becomes
a minimum when P and Q lie on E, and, as a result of (4), Σ and S attain a
minimum at the same time, then Σ attains a minimum when the prism bounding
surfaces ABC and A'B'C' are symmetrical with respect to E. Q.E.D.
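The prism theorem itself lends itself to a random test. In this sketch the three given lines are assumed vertical, meeting a normal plane at the points in `feet`, and the edge lengths h, k, l are sample values; `base_sum` is a hypothetical helper computing ABC + A'B'C' from the heights of the edge midpoints.

```python
import math
import random

def triangle_area(a, b, c):
    """Area of the triangle abc in space, via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(t * t for t in cross))

def base_sum(feet, lengths, mids):
    """ABC + A'B'C' for edges of the given lengths on vertical lines.

    `feet` are the points where the lines meet a normal plane,
    `mids` are the heights of the edge midpoints.
    """
    lower = [(x, y, z - e / 2) for (x, y), e, z in zip(feet, lengths, mids)]
    upper = [(x, y, z + e / 2) for (x, y), e, z in zip(feet, lengths, mids)]
    return triangle_area(*lower) + triangle_area(*upper)

feet = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]   # assumed positions of the lines
lengths = (1.0, 2.0, 3.5)                      # h, k, l
symmetric = base_sum(feet, lengths, (0.0, 0.0, 0.0))

random.seed(0)
trials = [base_sum(feet, lengths, [random.uniform(-2, 2) for _ in range(3)])
          for _ in range(1000)]

print(symmetric, min(trials))
```

None of the randomly shifted prisms has a smaller base surface sum than the symmetric one.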

Note. The preceding proof assumes that one prism edge (l) differs from the
other two. This limitation is of no importance, since it is immediately apparent
that the theorem is true in the case h = k = l.

The  continuation  of  the  proof  for  the  major  isoperimetric  theorem  is  similar 
to  that  in  No.  99. 

Let q be the solid that for a given volume V has the smallest surface; let the
latter be O.

We choose an arbitrary plane E and divide q by perpendiculars to E into
triangular prisms ABCA'B'C', which we assume to be so narrow that the
bounding triangles ABC and A'B'C' belonging to the surface of q can be
considered as plane triangles. From the points of intersection of the
perpendiculars ..., AA', BB', CC', ... with E we mark off on the perpendiculars on
both sides of E the halves of the segments ..., AA', BB', CC', ..., as a result of
which we obtain the points ..., a, a', b, b', c, c', .... The new prism abca'b'c'
possesses the symmetry plane E normal to the edges and, according to the above
prism theorem, possesses a base surface sum no greater than that of ABCA'B'C':

(5) abc + a'b'c' ≤ ABC + A'B'C',

in which the equals sign applies only if the prism ABCA'B'C' also possesses a
symmetry plane normal to the edges.

By means of our procedure we obtain from q a new solid q' with the
symmetry plane E, possessing the same volume V as q and a surface that, by (5),
is not greater than O. Since O is the smallest surface possible for the volume V,
the surface of q' cannot be smaller than O either. Therefore, the equals sign in (5)
must always apply. All prisms ABCA'B'C' therefore possess one plane of symmetry
normal to the edges, the perpendicular bisector plane of AA'.
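The symmetrization step is easiest to try out in the plane, where chords play the role of the prism edges. The sketch below is a planar analogue only, not the solid construction of the text: a right triangle is sampled by five vertical chords (an assumption for the example), each chord is re-centred on the axis, and area and perimeter are compared.

```python
import math

def symmetrize(chords):
    """Replace each chord (low, high) by the equally long chord centred on the axis."""
    return [(-(hi - lo) / 2, (hi - lo) / 2) for lo, hi in chords]

def area(xs, chords):
    """Trapezoid-rule area of the region swept out by the chords."""
    total = 0.0
    for i in range(len(xs) - 1):
        width = xs[i + 1] - xs[i]
        left = chords[i][1] - chords[i][0]
        right = chords[i + 1][1] - chords[i + 1][0]
        total += width * (left + right) / 2
    return total

def perimeter(xs, chords):
    """Length of the polygon through the chord endpoints, end chords included."""
    total = (chords[0][1] - chords[0][0]) + (chords[-1][1] - chords[-1][0])
    for i in range(len(xs) - 1):
        width = xs[i + 1] - xs[i]
        total += math.hypot(width, chords[i + 1][0] - chords[i][0])
        total += math.hypot(width, chords[i + 1][1] - chords[i][1])
    return total

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
chords = [(0.0, 3 * x / 4) for x in xs]   # right triangle (0,0), (4,0), (4,3)
sym = symmetrize(chords)

print(area(xs, chords), area(xs, sym))
print(perimeter(xs, chords), perimeter(xs, sym))
```

The area is unchanged because every chord keeps its length, while the boundary of the symmetrized figure (an isosceles triangle) is shorter.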

The  solid  q  having  the  smallest  surface  thus  possesses  a  parallel  symmetry 
plane  for  every  plane. 

Such  a  solid  must,  however,  be  a  sphere! 

Proof. Let I, II, III be three symmetry planes of q that are normal to each
other, M their point of intersection. Let the mirror image of an arbitrary point P
of q on I be P₁, let the mirror image of P₁ on II be P₁₂, let that of P₁₂ on III be
P₁₂₃ = P'. Then PMP' is a straight line and

MP' = MP,

i.e., the point M is a midpoint of q.
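That three successive mirror images in mutually perpendicular planes amount to a reflection through M is quickly verified in coordinates. In this sketch the three planes are assumed to be parallel to the coordinate planes and to pass through a sample point M; `mirror` is a hypothetical helper.

```python
def mirror(point, axis, offset):
    """Mirror image of `point` in the plane where coordinate `axis` equals `offset`."""
    image = list(point)
    image[axis] = 2 * offset - image[axis]
    return tuple(image)

M = (1, -2, 3)    # common point of the planes I, II, III (sample values)
P = (4, 5, -6)    # arbitrary point

P1 = mirror(P, 0, M[0])       # mirror image on I
P12 = mirror(P1, 1, M[1])     # mirror image on II
P123 = mirror(P12, 2, M[2])   # mirror image on III, the point P'

through_M = tuple(2 * m - p for m, p in zip(M, P))
print(P123, through_M)
```

Since P' = 2M − P, the point M bisects the segment PP', which is the statement MP' = MP with P, M, P' collinear.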

Now, q can have only one midpoint. (Proof as in No. 99.)

It then follows from this that M must lie on every symmetry plane of q.

Indeed, if M does not belong to the symmetry plane A of q, then we can draw
the mirror images m and p of M and of an arbitrary point P of the solid on A,
extend pM by its own length to the point p' of the solid, and draw the mirror
image p'' of p' on A. Now, since p'' is a point of q, Pmp'' is a straight line, and
mp'' = mP, this would result in a second midpoint, m, for q, which is impossible.

All the symmetry planes, therefore, intersect at M.

Now let F be a fixed point and P an arbitrary point of the surface of q. Since
the perpendicular bisector plane of FP is a symmetry plane of q, it passes
through M. Therefore,

MP = MF;

i.e., all the surface points of q are equidistant from M, and the solid q is a
sphere.


Of  all  solids  of  equal  volume  the  sphere  thus  has  the  smallest  surface. 

We  now  state  conversely: 

Of  all  solids  of  equal  surface  the  sphere  has  the  greatest  volume. 

Proof. Let the surface O of an arbitrary solid q, which is not a sphere, be
equal to the surface o of the sphere t. Let the volume of q be V and that of t be
v.

Let us assume V ≥ v; then let us consider the sphere t', concentric to t, having
the volume v' = V and the surface o'. Since t' contains t,

(6) o' ≥ o.

However, since the solids t' and q have the same volume, according to the
previously proved theorem, o' < O, or

(7) o' < o.

The inequalities (6) and (7) contradict each other. The assumption V ≥ v must
therefore be false, and v > V, as we asserted.
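A single comparison makes the statement concrete. The sketch below, offered as an illustration rather than a proof, compares a unit cube with the sphere of equal surface; the helper names and the quotient 36πV²/S³ (which equals 1 for a sphere) are additions for the example and do not appear in the text.

```python
import math

def sphere_volume_for_surface(surface):
    """Volume of the sphere whose surface area equals `surface`."""
    radius = math.sqrt(surface / (4 * math.pi))
    return 4 / 3 * math.pi * radius ** 3

def isoperimetric_quotient(volume, surface):
    """36*pi*V^2 / S^3; equals 1 for a sphere."""
    return 36 * math.pi * volume ** 2 / surface ** 3

cube_volume, cube_surface = 1.0, 6.0     # unit cube
sphere_volume = sphere_volume_for_surface(cube_surface)

print(cube_volume, sphere_volume)
print(isoperimetric_quotient(cube_volume, cube_surface))
print(isoperimetric_quotient(sphere_volume, cube_surface))
```

The sphere with the surface of the unit cube encloses more volume than the cube, and the cube's quotient is π/6, below the sphere's value of 1.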

* Gauss, Theoria motus corporum coelestium in sectionibus conicis solem ambientium (Hamburg,
1809). (English translation by G. H. Davis reprinted by Dover Publications, 1963.)

*  Translation:  “In  a  plane  the  question  of  polygons  of  maximum  or  minimum  area  inscribed  in  or 
circumscribed  about  an  ellipse  offers  no  difficulty.  All  that  is  necessary  is  to  project  the  ellipse  in  such 
manner  that  it  is  transformed  into  a  circle,  and  the  problem  is  reduced  to  a  well-known  question  of 
elementary  geometry”. 

*  The  inside  of  a  bounding  surface  of  a  tetrahedron  is  the  side  on  which  the  tetrahedron  is  situated. 